4857 Commits

Author SHA1 Message Date
antirez
bd99b26bc5 Sentinel: remove useless sentinelFlushConfig() call
Rewriting the config inside the loop that adds slaves back after a master
reset (done in order to handle switching to another master) is useless: it
only adds latency, since there is an fsync call in the inner loop, and it
provides no additional guarantee. On the contrary, if the server crashes
after the first loop iteration we end up with just a single slave entry,
losing all the other information.

It is wiser to rewrite the config at the end when the full new
state is configured.
2015-05-04 12:55:27 +02:00
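
The reasoning in bd99b26bc5 can be modeled with a small sketch. This is not Sentinel code: `FakeConfigFile`, `readd_slaves_flush_each()` and `readd_slaves_flush_once()` are hypothetical names, and `flush()` merely counts how many times an fsync would have happened and records what a crash would leave on disk.

```python
class FakeConfigFile:
    """Hypothetical stand-in for the on-disk Sentinel configuration."""

    def __init__(self):
        self.persisted = []  # state that would survive a crash
        self.fsyncs = 0      # number of simulated fsync calls

    def flush(self, state):
        self.persisted = list(state)
        self.fsyncs += 1


def readd_slaves_flush_each(slaves, cfg, crash_after=None):
    """Old behavior: rewrite the config after every slave is added back."""
    state = []
    for count, slave in enumerate(slaves, start=1):
        state.append(slave)
        cfg.flush(state)          # one fsync per loop iteration
        if crash_after == count:  # simulated crash in the middle of the loop
            return


def readd_slaves_flush_once(slaves, cfg):
    """New behavior: rewrite the config once the full state is configured."""
    state = list(slaves)
    cfg.flush(state)              # a single fsync at the end


if __name__ == "__main__":
    slaves = ["10.0.0.2:6379", "10.0.0.3:6379", "10.0.0.4:6379"]
    each, once, crashed = FakeConfigFile(), FakeConfigFile(), FakeConfigFile()
    readd_slaves_flush_each(slaves, each)
    readd_slaves_flush_once(slaves, once)
    readd_slaves_flush_each(slaves, crashed, crash_after=1)
    print(each.fsyncs, once.fsyncs, crashed.persisted)
```

In this model both variants end with the same persisted state, but the per-iteration variant pays one fsync per slave and, if interrupted early, leaves a partial slave list on disk.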
Yossi Gottlieb
0560738f6b Fix Redis server crash when Lua command exceeds client output buffer
limit.
2015-05-04 12:20:24 +02:00
clark.kang
88d58661db fix sentinel memory leak 2015-05-04 12:18:05 +02:00
antirez
315e3b14ef Fix Sentinel memory leak (hiredis bug)
This fixes issue #2535, that was actually an hiredis library bug (I
submitted an issue and fix to the redis/hiredis repo as well).

When an asynchronous hiredis connection subscribes to a Pub/Sub channel
and gets an error, and in other related conditions, the function
redisProcessCallbacks() enters a code path where the link is
disconnected, however the function returns before freeing the allocated
reply object. This causes a memory leak. The memory leak was trivial to
trigger in Redis Sentinel, which uses hiredis, every time we tried to
subscribe to an instance that required a password, in case the Sentinel
was configured either with the wrong password or without password at
all. In this case, the -AUTH error caused the leaking code path to be
executed.

It was verified with Valgrind that after this change the leak no longer
happens in Sentinel with a misconfigured authentication password.
2015-04-28 22:15:09 +02:00
antirez
7ff051f6c1 sha1.c: use standard uint32_t. 2015-04-27 12:07:59 +02:00
antirez
f387a5acf8 Old warning removed from release notes. 2015-04-01 17:34:22 +02:00
antirez
1fab07e078 Redis 3.0.0. 3.0.0 2015-04-01 16:01:44 +02:00
antirez
8ebae5d630 dict.c: remove dictGetRandomKeys() API, no longer used. 2015-04-01 15:50:54 +02:00
Salvatore Sanfilippo
21c3d77118 Merge pull request #2477 from asheldon/patch-1
2.8 is a subset of 3.0, not the converse.
2015-04-01 15:32:28 +02:00
antirez
60a28fad8a Net: improve prepareClientToWrite() error handling and comments.
When we fail to set up the write handler it does not make sense to keep
the client around, since it is missing writes: whether it is a client or a
slave, the connection should be terminated ASAP.

Moreover, what the function does exactly with its return value, and in
which case the write handler is installed on the socket, was not clear,
so the function's comments are improved to make its goals more obvious.

Also related to #2485.
2015-04-01 15:20:54 +02:00
antirez
e42baed4c3 Test: be more patient waiting for servers to exit.
This should likely fix a false positive when running with the --valgrind
option.
2015-04-01 15:20:54 +02:00
Oran Agra
aa67aec84e fixes to diskless replication.
The master was closing the connection if the RDB transfer took a long time.
It also sent PINGs to the slave before it got the initial ACK, in which case the slave wouldn't be able to find the EOF marker.
2015-04-01 15:20:54 +02:00
antirez
93959bc09f Sentinel / Cluster test: exit with non-zero error code on failures. 2015-03-30 14:29:18 +02:00
antirez
2e92d0f04a Test: regression for issue #2473. 2015-03-27 12:11:27 +01:00
antirez
adcb470130 dict.c: add casting to avoid compilation warning.
rehashidx is always positive in the two code paths, since the only
negative value it could have is -1 when there is no rehashing in
progress, and the condition is explicitly checked.
2015-03-27 10:10:39 +01:00
asheldon
1b71fea998 2.8 is a subset of 3.0, not the converse. 2015-03-26 13:41:00 -07:00
antirez
2b5cf6bf78 Redis 2.9.106 (3.0.0 Release Candidate 6) 3.0.0-rc6 2015-03-24 16:27:12 +01:00
antirez
7e78ab4b6f Replication: disconnect blocked clients when switching to slave role.
Bug as old as Redis and blocking operations. It's hard to trigger since
it only happens on an instance role switch, but the results are quite bad
since an inconsistency between master and slave is created.

How to trigger the bug is a good description of the bug itself.

1. Client does "BLPOP mylist 0" in master.
2. Master is turned into slave, that replicates from New-Master.
3. Client does "LPUSH mylist foo" in New-Master.
4. New-Master propagates write to slave.
5. Slave receives the LPUSH, the blocked client gets served.

Now New-Master "mylist" key has "foo", Slave "mylist" key is empty.

Highlights:

* At step "2" above, the client remains attached, basically escaping any
  check performed during command dispatch: read only slave, in that case.
* At step "5" the slave (that was the master), serves the blocked client
  consuming a list element, which is not consumed on the master side.

This scenario is technically likely to happen during failovers, however
since Redis Sentinel already disconnects clients using the CLIENT
command when changing the role of the instance, the bug is avoided in
Sentinel deployments.

Closes #2473.
2015-03-24 16:16:44 +01:00
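
The five steps in 7e78ab4b6f can be replayed with a toy model. This is only a sketch under simplifying assumptions: `Instance` and `scenario()` are hypothetical names, replication is a direct method call, and serving a blocked client on a master is not propagated further because the scenario never needs it.

```python
from collections import deque


class Instance:
    """Toy Redis instance: lists, BLPOP waiters and naive replication."""

    def __init__(self, name):
        self.name = name
        self.lists = {}      # key -> deque of values
        self.blocked = {}    # key -> deque of clients blocked in BLPOP
        self.slaves = []
        self.delivered = []  # (client, value) pairs served by this instance

    def blpop(self, client, key):
        items = self.lists.get(key)
        if items:
            self.delivered.append((client, items.popleft()))
        else:
            self.blocked.setdefault(key, deque()).append(client)

    def lpush(self, key, value):
        self.lists.setdefault(key, deque()).appendleft(value)
        for slave in self.slaves:       # propagate the write
            slave.lpush(key, value)
        waiting = self.blocked.get(key)
        if waiting:                     # serve one blocked client, if any
            client = waiting.popleft()
            self.delivered.append((client, self.lists[key].popleft()))

    def slaveof(self, new_master, disconnect_blocked):
        if disconnect_blocked:          # the fix: drop blocked clients
            self.blocked.clear()
        new_master.slaves.append(self)


def scenario(disconnect_blocked):
    master, new_master = Instance("Master"), Instance("New-Master")
    master.blpop("client", "mylist")                # step 1
    master.slaveof(new_master, disconnect_blocked)  # step 2
    new_master.lpush("mylist", "foo")               # steps 3, 4 and 5
    return list(new_master.lists["mylist"]), list(master.lists["mylist"])


if __name__ == "__main__":
    print("without fix:", scenario(disconnect_blocked=False))
    print("with fix:   ", scenario(disconnect_blocked=True))
```

Without the disconnection the two copies of `mylist` diverge; with it, the slave keeps the element that the new master also holds.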
antirez
3468cd3664 Cluster: redirection refactoring + handling of blocked clients.
There was a bug in Redis Cluster caused by clients blocked in a blocking
list pop operation, for keys no longer handled by the instance, or
in a condition where the cluster became down after the client blocked.

A typical situation is:

1) BLPOP <somekey> 0
2) <somekey> hash slot is resharded to another master.

The client will block forever in this case.

A symmetrical non-cluster-specific bug happens when an instance is
turned from master to slave. In that case it is more serious since this
will desynchronize data between slaves and masters. This other bug was
discovered as a side effect of thinking about the bug explained and
fixed in this commit, but will be fixed in a separate commit.
2015-03-24 16:16:44 +01:00
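
One way to picture the handling described in 3468cd3664 is a periodic re-check of blocked clients against the current slot ownership. The sketch below is an assumption about the general shape of such a check, not the actual implementation: `recheck_blocked_clients()` and its arguments are hypothetical, and only the reply prefixes (`-MOVED <slot> <host:port>` and `-CLUSTERDOWN`) follow the Redis Cluster protocol.

```python
def recheck_blocked_clients(blocked, slot_owner, myself, cluster_ok=True):
    """Split blocked clients into those that may keep waiting and those
    that must be woken up with an error.

    blocked:    dict mapping client id -> hash slot of the key it waits on
    slot_owner: dict mapping hash slot -> "host:port" of the owning master
    myself:     "host:port" of this instance
    """
    still_blocked = {}
    replies = {}
    for client, slot in blocked.items():
        owner = slot_owner.get(slot)
        if not cluster_ok or owner is None:
            replies[client] = "-CLUSTERDOWN"             # cluster cannot serve the slot
        elif owner != myself:
            replies[client] = f"-MOVED {slot} {owner}"   # slot was resharded away
        else:
            still_blocked[client] = slot                 # still ours: keep waiting
    return still_blocked, replies


if __name__ == "__main__":
    blocked = {"c1": 100, "c2": 200}
    owners = {100: "10.0.0.2:7001", 200: "10.0.0.1:7000"}
    print(recheck_blocked_clients(blocked, owners, "10.0.0.1:7000"))
```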
superlogical
d1b5c5defd create-cluster fix for stop and watch commands 2015-03-24 16:16:44 +01:00
antirez
66899a42fc Cluster: unit 10 modified to leave cluster in proper state. 2015-03-22 23:00:38 +01:00
antirez
76b18c7a0e Cluster: CLUSTER FAILOVER TAKEOVER tests. 2015-03-22 23:00:38 +01:00
antirez
ca804a1022 Cluster: more tests for manual failover + FORCE. 2015-03-22 23:00:38 +01:00
antirez
d15d9fecd2 Cluster: new tests for manual failover and scripts replication. 2015-03-22 23:00:38 +01:00
antirez
c2717911db Cluster: fix Lua scripts replication to slave nodes. 2015-03-22 22:24:05 +01:00
antirez
1641f41cfc Two cluster.c comments improved. 2015-03-21 18:23:10 +01:00
antirez
b37b2b5c14 Cluster: TAKEOVER option for manual failover. 2015-03-21 18:23:06 +01:00
antirez
c43c970344 Net: processUnblockedClients() and clientsArePaused() minor changes.
1. No need to set btype in processUnblockedClients(), since clients
   flagged REDIS_UNBLOCKED should have it already cleared.
2. When putting clients in the unblocked clients list, clientsArePaused()
   should flag them with REDIS_UNBLOCKED. Not strictly needed with the
   current code but is more coherent.
2015-03-21 18:23:01 +01:00
antirez
b64c861171 Cluster: non-conditional steps of slave failover refactored into a function. 2015-03-21 18:22:46 +01:00
antirez
47bbaa17b0 Cluster: separate unknown master check from the rest.
In no case should we attempt a failover if myself->slaveof is
NULL.
2015-03-21 18:22:39 +01:00
antirez
0595420b1e Cluster: refactoring around configEpoch handling.
This commit moves the process of generating a new config epoch without
consensus out of the clusterCommand() implementation, in order to make
it reusable for other reasons (current target is to have a CLUSTER
FAILOVER option forcing the failover when no master majority is
reachable).

Moreover the commit moves other functions which are similarly related to
config epochs in a new logical section of the cluster.c file, just for
clarity.
2015-03-21 18:22:33 +01:00
antirez
2d34ec60bf Fix typo in beforeSleep() comment. 2015-03-21 09:19:51 +01:00
antirez
2d7d75adb3 Net: clientsArePaused() should not touch blocked clients.
When the list of unblocked clients was processed, btype was set to
blocking type none, but the client remained flagged with REDIS_BLOCKED.
When timeout is reached (or when the client disconnects), unblocking it
will trigger an assertion.

There is no need to process pending requests from blocked clients, so
now clientsArePaused() just avoids touching blocked clients.

Close #2467.
2015-03-21 09:15:14 +01:00
antirez
8dac5c8bb3 Redis 2.9.105 (3.0.0 Release Candidate 5) 3.0.0-rc5 2015-03-20 10:35:12 +01:00
antirez
62893f5b9f Cluster: better cluster state transition handling.
Previously we relied on the global cluster state to make sure all the hash
slots are linked to some node when getNodeByQuery() is called, so
finding the hash slot unbound was checked with an assertion. However
this is fragile. The cluster state is often updated in the
clusterBeforeSleep() function, and not ASAP on state change, so we may
end up processing clients with a cluster state that is 'ok' while
certain hash slots are still set to NULL.

With this commit the condition is also checked in getNodeByQuery() and
reported with an identical error code of -CLUSTERDOWN but a slightly
different error message, so that we have more debugging clues in the
future.

Root cause of issue #2288.
2015-03-20 10:06:11 +01:00
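
The change in 62893f5b9f amounts to replacing an assertion with an explicit error path. A minimal sketch of that idea follows; `route_query()` is a hypothetical name, the slot table is a plain list, and the text after `-CLUSTERDOWN` is illustrative only, since the commit message does not quote the real error messages.

```python
CLUSTER_SLOTS = 16384  # number of hash slots in Redis Cluster


def route_query(slot, slots, cluster_ok):
    """Return ("ok", node) or ("error", message) for a query on `slot`.

    slots: list of length CLUSTER_SLOTS; each entry is a node name or None
           when the slot is currently unbound.
    """
    if not cluster_ok:
        # Global state check, as before.
        return ("error", "-CLUSTERDOWN cluster state is not ok (illustrative text)")
    node = slots[slot]
    if node is None:
        # Previously an assertion; now the same error code with a different
        # message, so the two cases can be told apart when debugging.
        return ("error", "-CLUSTERDOWN slot is unbound (illustrative text)")
    return ("ok", node)


if __name__ == "__main__":
    slots = [None] * CLUSTER_SLOTS
    slots[0] = "node-a"
    print(route_query(0, slots, cluster_ok=True))
    print(route_query(1, slots, cluster_ok=True))
    print(route_query(0, slots, cluster_ok=False))
```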
antirez
585f68ac35 Cluster: move clusterBeforeSleep() call before unblocked clients processing.
Related to issue #2288.
2015-03-20 10:06:07 +01:00
antirez
d8236ea262 Cluster: more robust slave check in CLUSTER REPLICATE.
There are rare conditions where node->slaveof may be NULL even if the
node is a slave. To check by flag is much more robust.
2015-03-18 12:09:39 +01:00
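
The difference between the two checks in d8236ea262 can be shown with a toy node structure. Everything here is hypothetical: the `Node` class, the `NODE_SLAVE` bit value and both helper functions are stand-ins chosen for illustration, not the definitions used in cluster.c.

```python
from dataclasses import dataclass
from typing import Optional

NODE_SLAVE = 1 << 1  # hypothetical flag bit marking a node as a slave


@dataclass
class Node:
    name: str
    flags: int = 0
    slaveof: Optional["Node"] = None  # may be unset even for a slave


def is_slave_by_pointer(node):
    """Fragile check: relies on the master pointer being populated."""
    return node.slaveof is not None


def is_slave_by_flag(node):
    """Robust check: relies only on the role flag."""
    return bool(node.flags & NODE_SLAVE)


if __name__ == "__main__":
    # A slave whose master is not (yet) known: flag set, pointer empty.
    orphan = Node("replica-1", flags=NODE_SLAVE, slaveof=None)
    print(is_slave_by_pointer(orphan), is_slave_by_flag(orphan))
```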
antirez
6adf3ca798 Cluster: ignore various node files in create-cluster dir. 2015-03-18 11:29:33 +01:00
antirez
6ec87978de Added regression test for issue #2371. 2015-03-18 11:29:32 +01:00
antirez
d225e79b8d HAVE_SYNC_FILE_RANGE should be protected by ifdef __linux__.
Related to issue #2372.
2015-03-18 11:29:32 +01:00
Mariano Pérez Rodríguez
586211ae60 Fix for #2371
Fixing #2371 as per @mattsta's suggestion
2015-03-18 11:29:32 +01:00
antirez
51a0ee1ed7 Narrow backtrace and setproctitle() to Linux+glibc.
Backtrace is a glibc extension, while setproctitle() implementation
depends on the memory layout and is partially libc dependent.
2015-03-18 11:29:32 +01:00
antirez
e772c8e7eb redis-cli --latency-dist: one gray more, and --mono support. 2015-03-18 11:29:32 +01:00
antirez
99764d2722 redis-cli --latency-dist, hopefully better palette.
Fewer grays: a more readable palette, since we usually have a non-linear
distribution of percentages and very near gray tones are hard to tell
apart. The final part of the palette is a gradient from yellow to red. The
red part is hardly reached because of the usual distribution of latencies,
but shows up mainly when latencies are very high because of the logarithmic
scale. This is consistent with what people expect: red = bad.
2015-03-18 11:29:32 +01:00
Masahiko Sawada
b5bd8f6afe Unify to uppercase the headline 2015-03-18 11:29:32 +01:00
Michel Martens
f36482dd5f Add command CLUSTER MYID 2015-03-18 11:29:32 +01:00
antirez
085ef2087b Net: better Unix socket error. Issue #2449. 2015-03-18 11:29:32 +01:00
Leandro López (inkel)
b10c2b7bb2 Support CLIENT commands in Redis Sentinel
When trying to debug sentinel connections or max connections errors it
would be very useful to have the ability to see the list of clients
connected to a running sentinel. At the same time it would be very helpful
to be able to name each sentinel connection or kill offending clients.

This commit adds the already defined CLIENT commands back to Redis
Sentinel.
2015-03-13 18:29:59 +01:00
antirez
f5d9e3f715 Config: activerehashing option support in CONFIG SET. 2015-03-08 15:33:49 +01:00
antirez
45ff739cdf Fix iterator for issue #2438.
Iterator misuse due to analyzeLatencyForEvent() accessing the
dictionary during the iteration, without the iterator being
declared as safe.
2015-03-04 11:49:48 -08:00