5664 Commits

Author SHA1 Message Date
antirez
08db70928b Cluster: clarify node->slave may be NULL. 2016-01-25 15:20:05 +01:00
antirez
bcc556eb5b Cluster: fix rebalancing to always empty nodes.
Because of rounding error even with weight=0 sometimes a node was left
with an assigned slot.

Close #3001.
2016-01-25 15:20:01 +01:00
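
The rounding problem described in the commit above can be illustrated with a small sketch. This is not the redis-trib code (which is written in Ruby): the helper name `expected_slots` and the node names are hypothetical, and handing rounding leftovers only to nodes with a positive weight is one possible way to guarantee that a weight of 0 ends with zero slots.

```python
TOTAL_SLOTS = 16384  # number of hash slots in a Redis Cluster

def expected_slots(weights):
    """Return the target slot count per node for the given weights.

    Hypothetical helper: nodes with weight 0 always get exactly 0 slots,
    and rounding leftovers go only to nodes with a positive weight.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one node needs a positive weight")
    targets = {}
    for node, w in weights.items():
        # Round down the proportional share; weight 0 is forced to 0.
        targets[node] = int(TOTAL_SLOTS * w / total_weight) if w > 0 else 0
    # Distribute the slots lost to rounding among positive-weight nodes only.
    leftover = TOTAL_SLOTS - sum(targets.values())
    positive = sorted(n for n, w in weights.items() if w > 0)
    for i in range(leftover):
        targets[positive[i % len(positive)]] += 1
    return targets

targets = expected_slots({"node-a": 1, "node-b": 1, "node-c": 1, "node-d": 0})
```

With three equally weighted nodes the shares do not divide evenly, so one node receives the extra slot while `node-d` stays empty.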
antirez
557e7c3af3 Cluster: redis-trib move_to_slot: don't send SETSLOT to slaves. 2016-01-25 15:19:57 +01:00
antirez
18f62b127b Cluster: fix redis-trib reference of variable in warning. 2016-01-25 15:19:54 +01:00
antirez
ace53e899a CLUSTER BUMPEPOCH initial implementation fixed. 2016-01-25 15:19:50 +01:00
antirez
eb2d95a47c Cluster: implement redis-trib fix when slot is open without owners.
Still work to do.
2016-01-25 15:19:47 +01:00
antirez
874ad0cbdf Cluster: implement redis-trib fix for uncovered slots. 2016-01-25 15:19:47 +01:00
antirez
90a79bc1b6 Cluster: CLUSTER BUMPEPOCH introduced to help redis-trib fix.
Sometimes during "fixes" we have to set up a new configuration and assign
slots to nodes. With BUMPEPOCH we can make sure the new configuration of
the node will win if there are conflicting configurations (for example
another node is *also* claiming the same slot because the cluster is
totally messed up).
2016-01-25 15:19:35 +01:00
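
The idea behind the commit above, that the configuration with the greater epoch wins when two nodes claim the same slot, can be sketched as a toy model. The names `Node`, `bump_epoch` and `slot_owner` are hypothetical, and modelling the bump as "highest known epoch plus one" is an assumption made for illustration, not the server implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    config_epoch: int
    slots: set = field(default_factory=set)

def bump_epoch(node, cluster):
    """Give `node` an epoch greater than every epoch known in `cluster`.

    Returns True if the epoch changed, False if it was already the highest.
    """
    highest = max(n.config_epoch for n in cluster)
    if node.config_epoch == 0 or node.config_epoch != highest:
        node.config_epoch = highest + 1
        return True
    return False

def slot_owner(slot, cluster):
    """Among the nodes claiming `slot`, the highest config epoch wins."""
    claimants = [n for n in cluster if slot in n.slots]
    return max(claimants, key=lambda n: n.config_epoch) if claimants else None

a = Node("a", config_epoch=5, slots={100})
b = Node("b", config_epoch=3, slots={100})  # also claims slot 100
cluster = [a, b]
before = slot_owner(100, cluster).name  # "a" wins with the higher epoch
bumped = bump_epoch(b, cluster)         # b's epoch becomes 6
after = slot_owner(100, cluster).name   # now "b" wins the conflict
```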
antirez
6ce536db49 Cluster: don't allow CLUSTER SETSLOT with slaves. 2016-01-25 15:19:32 +01:00
antirez
a7ec6d1ffa Cluster: check packets length before accessing far fields. 2016-01-19 13:18:05 +01:00
antirez
5fd61c9558 Scripting: handle trailing comments.
This fix, provided by Paul Kulchenko (@pkulchenko), allows the Lua
scripting engine to evaluate statements with a trailing comment like the
following one:

    EVAL "print() --comment" 0

Lua can't parse the above if the string does not end with a newline, so
now a final newline is always added automatically. This does not change
the SHA1 of scripts since the SHA1 is computed on the body we pass to
EVAL, without the other code we add to register the function.

Close #2951.
2016-01-08 15:45:13 +01:00
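
A minimal sketch of why the added newline does not change the script SHA1: the hash covers only the body, while the newline belongs to the wrapper code. The helper `register_script` is hypothetical, and the `f_<sha1>` wrapper naming is an assumption used here for illustration.

```python
import hashlib

def register_script(body):
    """Return (sha1_hex, lua_source) for a script body.

    The SHA1 covers only the body. The wrapper, including the final
    newline added before `end`, is not part of the hash.
    """
    sha = hashlib.sha1(body.encode("utf-8")).hexdigest()
    # The newline keeps a trailing "--comment" from swallowing `end`.
    source = "function f_" + sha + "() " + body + "\nend"
    return sha, source

body = "print() --comment"
sha, source = register_script(body)
last_line = source.splitlines()[-1]
```

Because `end` sits on its own line, the trailing comment in the body can no longer comment it out.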
antirez
d975baa35b Allow MIGRATE to always be called on local keys for open slots.
Extend the MIGRATE extra freedom to be able to be called in the context
of the local slot, anytime there is a slot open in one or the other
direction (importing or migrating). This is useful for redis-trib to fix
the cluster when it is in an odd state.

This fix allows "redis-trib fix" to do its work in certain cases where
previously an error was reported.
2016-01-08 15:35:32 +01:00
antirez
515bbdfcdd Fix typos & grammar in clusterBumpConfigEpochWithoutConsensus() comment. 2016-01-08 15:35:28 +01:00
antirez
6cbd559679 Lua debugger: support direct calls to SCRIPT DEBUG in redis-cli.
Previously it was possible to activate a debugging session only using
the --ldb option in redis-cli. Now calling SCRIPT DEBUG can also
activate the debugging mode without putting the redis-cli in a
desynchronized state.

Related to #2952.
2016-01-08 15:35:23 +01:00
antirez
8cc1a49edf Lua debugger: fix crash printing nested or deep objects.
Example of offending code:

> script debug yes
OK
> eval "local a = {1} a[1] = a\nprint(a)" 0
1) * Stopped at 1, stop reason = step over
2) -> 1   local a = {1} a[1] = a
> next
1) * Stopped at 2, stop reason = step over
2) -> 2   print(a)
> print

... server crash ...

Close #2955.
2016-01-08 15:35:18 +01:00
antirez
d256abe9c0 Another typo in protected mode error message. 2016-01-08 15:35:14 +01:00
antirez
068461521f Fix protected mode error message typo. 2016-01-08 15:35:09 +01:00
antirez
273c49e726 New security feature: Redis protected mode.
An exposed Redis instance on the internet can be the cause of serious
issues. Since Redis, by default, binds to all the interfaces, it is easy
to leave an instance without any protection layer by mistake.

Protected mode tries to address this problem in a soft way, providing a
layer of protection, but giving clues to Redis users about why the
server is not accepting connections.

When protected mode is enabled (the default), and there is not even a
minimal hint that the server is properly configured (no "bind"
directive is used in order to restrict the server to certain
interfaces, and no password is set), clients connecting from external
interfaces are refused with an error explaining what to do in order to
fix the issue.

Clients connecting from the IPv4 and IPv6 loopback interfaces are still
accepted normally. Similarly, Unix domain socket connections are not
restricted in any way.
2016-01-08 15:35:06 +01:00
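
The acceptance rules described in the commit above can be sketched as a small decision function. This is an illustration of the stated rules only: the function `accept_connection` and its parameters are hypothetical names, not the server code.

```python
import ipaddress

def accept_connection(client_addr, protected_mode=True, bind_set=False,
                      password_set=False, unix_socket=False):
    """Decide whether a new connection is accepted (hypothetical helper).

    Protected mode only restricts clients when it is enabled and neither
    a "bind" directive nor a password has been configured.
    """
    if not protected_mode or bind_set or password_set:
        return True
    if unix_socket:
        return True  # Unix domain sockets are never restricted
    # Only IPv4/IPv6 loopback clients are accepted in this state.
    return ipaddress.ip_address(client_addr).is_loopback

local_v4 = accept_connection("127.0.0.1")
local_v6 = accept_connection("::1")
external = accept_connection("192.168.1.10")
external_with_password = accept_connection("192.168.1.10", password_set=True)
```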
antirez
b18e42b23e Cluster: don't send -ASK to MIGRATE.
For non existing keys, we don't want to send -ASK redirections to
MIGRATE, since when moving slots from the migrating node to the
importing node, we want just to ignore keys that are no longer there.
They may be expired or deleted between the GETKEYSINSLOT call and the
MIGRATE call. Otherwise this causes an error during migrations with
redis-trib (or equivalent cluster management tools).
2016-01-06 12:18:55 +01:00
antirez
2ebfd0f7c5 Cluster test: do leaks detection with OSX leaks utility. 2016-01-02 13:25:20 +01:00
antirez
13a70eb684 redis-trib: Remove duplicated key in hash initialization. 2016-01-02 13:25:20 +01:00
antirez
e4f1994bd1 Cluster/Sentinel test: report ability to run via valgrind. 2015-12-29 15:27:52 +01:00
antirez
bb3ed7e2d0 Changelog for 3.2 release: more details and credits. 2015-12-23 16:06:30 +01:00
antirez
3955fdee60 Redis 3.1.101 (Redis 3.2.0 RC1). 3.2-rc1 2015-12-23 13:35:32 +01:00
antirez
9cd1cd6680 Test: improve PFCOUNT with multiple keys testing.
A user raised a question about a given behavior of PFCOUNT. Added a
test to show the behavior (union) is correct when most of the items are
in common.
2015-12-23 12:43:08 +01:00
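
The union semantics checked by the test above can be shown with exact Python sets standing in for HyperLogLog values. This is only an analogy: PFCOUNT returns an approximated cardinality, while the hypothetical helper `union_count` below is exact.

```python
def union_count(*key_sets):
    """Exact cardinality of the union of several sets."""
    merged = set()
    for s in key_sets:
        merged |= s
    return len(merged)

hll1 = set(range(0, 1000))
hll2 = set(range(10, 1010))  # shares 990 elements with hll1
count = union_count(hll1, hll2)
```

Even though each set holds 1000 items, the union holds only 1010 because most items are in common.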
Paul Kulchenko
db1df45494 Update pretty printing in debugging to generate valid Lua code for userdata-like types. 2015-12-22 09:01:54 +01:00
Paul Kulchenko
c428066af2 Update pretty printing in debugging to generate valid Lua code for tables. 2015-12-22 09:01:54 +01:00
antirez
b9baccb766 Cluster: rebalance now supports --threshold option. 2015-12-18 15:52:17 +01:00
antirez
39667f7e1a Cluster: redis-trib reshard / rebalance --pipeline support. 2015-12-18 15:52:17 +01:00
antirez
db035ecc44 Cluster: verify slaves consistency after resharding. 2015-12-18 11:34:05 +01:00
antirez
57f079a2ae Fix typo in prepareClientToWrite() comment. 2015-12-18 09:19:13 +01:00
antirez
cd29e7be27 Cluster: resharding test now checks AOF consistency.
It's a key invariant that when AOF is enabled, after the cluster
reshards, a crash-recovery event causes all the keys to be still fine
with the expected logical content. Now this is part of unit 04.
2015-12-17 17:53:29 +01:00
antirez
7a7e46b22f Fix a race that may lead to the active (slave) client to be freed.
In issue #2948 a crash was reported in processCommand(). Later Oran Agra
(@oranagra) traced the bug (in private chat) to the following sequence
of events:

1. Some maxmemory is set.
2. The slave is the currently active client and is executing PING or
   REPLCONF or whatever a slave can send to its master.
3. freeMemoryIfNeeded() is called since maxmemory is set.
4. flushSlavesOutputBuffers() is called by freeMemoryIfNeeded().
5. During slaves buffers flush, a write error could be encountered in
   writeToClient() or sendReplyToClient() depending on the version of
   Redis. This will trigger freeClient() against the currently active
   client, so a segmentation fault will likely happen in
   processCommand() immediately after the call to freeMemoryIfNeeded().

There are different possible fixes:

1. Add flags to writeToClient() (recent versions code base) so that
   we can ignore the write errors, and use this flag in
   flushSlavesOutputBuffers(). However this is not simple to do in older
   versions of Redis.
2. Use freeClientAsync() during write errors. This works but changes the
   current behavior of releasing clients ASAP when possible. Normally
   we write to clients during the normal event loop processing, in the
   writable client, where there is no active client, so no care must be
   taken.
3. The fix of this commit: to detect that the current client is no
   longer valid. This fix is a bit "ad-hoc", but works across all the
   versions and has the advantage of not changing the remaining
   behavior. Only alters what happens during this race condition,
   hopefully.
2015-12-17 09:52:03 +01:00
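
The third option above, detecting that the current client is no longer valid, can be sketched as a toy simulation. The Python classes and function names mirror the C functions mentioned in the commit message but are hypothetical stand-ins; clearing a "current client" reference on free is an assumption about how the detection could work, not a copy of the actual patch.

```python
class Server:
    def __init__(self):
        self.current_client = None
        self.clients = []

class Client:
    def __init__(self, name, write_fails=False):
        self.name = name
        self.write_fails = write_fails  # simulate a write error on flush

def free_client(server, client):
    server.clients.remove(client)
    if server.current_client is client:
        server.current_client = None  # mark the active client as gone

def flush_slaves_output_buffers(server):
    for c in list(server.clients):
        if c.write_fails:
            free_client(server, c)

def free_memory_if_needed(server):
    flush_slaves_output_buffers(server)

def process_command(server, client):
    server.current_client = client
    free_memory_if_needed(server)
    if server.current_client is None:
        return "client freed"  # stop here: `client` must not be used again
    return "executed"

server = Server()
slave = Client("slave", write_fails=True)
normal = Client("normal")
server.clients.extend([slave, normal])
slave_result = process_command(server, slave)
normal_result = process_command(server, normal)
```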
antirez
f50dfff0ee Fix processCommand() comment about return value. 2015-12-17 09:51:58 +01:00
antirez
fc00042ef9 Hopefully better memory test on crash.
The old test, designed to do a transformation on the bits that was
invertible, in order to avoid touching the original memory content, was
not as effective as redis-server --test-memory. The former often
reported OK while the latter was able to spot the error.

So the test was substituted with one that may perform better, however
the new one must backup the memory tested, so it tests memory in small
pieces. This limits the effectiveness because of the CPU caches. However
some attempt is made in order to trash the CPU cache between the fill
and the check stages, but not for the addressing test unfortunately.

We'll see if this test will be able to find errors where the old failed.
2015-12-17 09:51:54 +01:00
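
The "back up, test, restore" approach described above can be sketched as follows. A Python `bytearray` is only a stand-in for real memory, and the function name `check_memory_preserving`, the chunk size and the fill patterns are all hypothetical choices, so this shows the control flow rather than a real memory test.

```python
def check_memory_preserving(memory, chunk_size=4096,
                            patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Test `memory` (a bytearray) chunk by chunk, keeping its content.

    Returns the number of mismatching bytes seen during the checks.
    """
    errors = 0
    for start in range(0, len(memory), chunk_size):
        end = min(start + chunk_size, len(memory))
        backup = bytes(memory[start:end])  # save the original content
        for p in patterns:
            memory[start:end] = bytes([p]) * (end - start)         # fill
            errors += sum(1 for b in memory[start:end] if b != p)  # check
        memory[start:end] = backup  # restore the original content
    return errors

memory = bytearray(range(256)) * 40
original = bytes(memory)
errors = check_memory_preserving(memory)
```

Working in small chunks keeps the backup buffer small, which is the trade-off the commit message mentions: a small region is more likely to be served from the CPU caches than from RAM.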
antirez
ef92f90d34 Suppress harmless warnings. 2015-12-17 09:51:48 +01:00
antirez
4fee390335 memtest.c now can be called as API in non interactive mode. 2015-12-17 09:51:45 +01:00
antirez
f034a075b9 Crash report format improvements. 2015-12-17 09:51:42 +01:00
antirez
72be072e24 Cluster: allows abbreviated node IDs with rebalance --weight option. 2015-12-17 09:51:38 +01:00
antirez
8bc75fc1e9 Cluster: rebalancing option --simulate, and a fix. 2015-12-17 09:51:35 +01:00
antirez
b8373fb482 Cluster: redis-trib rebalance initial implementation. 2015-12-17 09:51:31 +01:00
antirez
1ae686656a Initial implementation of redis-trib info subcommand. 2015-12-17 09:51:27 +01:00
antirez
d5b55bdf13 Log address causing SIGSEGV. 2015-12-15 18:01:02 +01:00
Sun He
6521a6b13b lua_struct.c/getnum: throw error if overflow happens
Fix issue #2855
2015-12-14 17:58:51 +01:00
antirez
87615a81ab Cluster: redis-trib: use variadic MIGRATE.
We use the new variadic/pipelined MIGRATE for faster migration.
Testing is not easy because seeing the time it takes for a slot to be
migrated requires a very large data set, but even with all the overhead
of migrating multiple slots and setting them up properly, what used to
take 4 seconds (1 million keys, 200 slots migrated) now takes 1.6, which
is a good improvement. However the improvement can be a lot larger if:

1. We use large datasets where a single slot has many keys.
2. By moving more than 10 keys per iteration, making this configurable,
   which is planned.

Close #2710
Close #2711
2015-12-13 10:09:18 +01:00
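
A minimal sketch of how keys can be grouped into variadic MIGRATE calls, ten keys per iteration as mentioned above. The generator `migrate_commands`, the target address and the timeout are hypothetical; the argument layout follows the MIGRATE form that takes an empty key argument together with the KEYS option.

```python
def migrate_commands(host, port, keys, batch_size=10, db=0, timeout_ms=60000):
    """Yield MIGRATE argument lists, moving `batch_size` keys per call."""
    for i in range(0, len(keys), batch_size):
        batch = keys[i:i + batch_size]
        # The key argument is the empty string when the KEYS option is used.
        yield ["MIGRATE", host, str(port), "", str(db), str(timeout_ms),
               "KEYS", *batch]

keys = ["key:%d" % i for i in range(25)]
commands = list(migrate_commands("127.0.0.1", 7001, keys))
```

Making `batch_size` a parameter reflects the planned configurability noted in the commit message: 25 keys take three calls here instead of 25 single-key calls.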
antirez
00353f9902 MIGRATE: Fix key extraction for new form. 2015-12-13 10:09:18 +01:00
antirez
f99be541b3 MIGRATE: test more corner cases. 2015-12-13 10:09:18 +01:00
antirez
884ce38bed MIGRATE: Fix new argument rewriting refcount handling. 2015-12-13 10:09:18 +01:00
antirez
2b74b9857b MIGRATE: fix replies processing and argument rewriting.
We need to process replies after errors in order to delete keys
successfully transferred. Also argument rewriting was fixed since
it was broken in several ways. Now a fresh argument vector is created
and set if at least one key is acknowledged.
2015-12-13 10:09:18 +01:00
antirez
d6bc17c254 Test: pipelined MIGRATE tests added. 2015-12-13 10:09:18 +01:00