93 Commits

Author SHA1 Message Date
antirez
98682cd178 Log to what master a slave is going to connect to. 2013-11-11 09:25:41 +01:00
antirez
df0c96002d Replication: install the write handler when reusing a cached master.
Sometimes when we resurrect a cached master after a successful partial
resynchronization attempt, there is pending data in the output buffers
of the client structure representing the master (likely REPLCONF ACK
commands).

If we don't reinstall the write handler, it will never be installed
again by addReply*() family functions as they'll assume that if there is
already data pending, the write handler is already installed.

This bug caused some slaves after a successful partial sync to never
send REPLCONF ACK, and continuously being detected as timing out by the
master, with a disconnection / reconnection loop.
2013-10-04 16:14:57 +02:00
antirez
c8c1006cf4 PSYNC: safer handling of PSYNC requests.
There was a bug that over-esteemed the amount of backlog available,
however this could only happen when a slave was asking for an offset
that was in the "future" compared to the master replication backlog.

Now this case is handled well and logged as an incident in the master
log file.
2013-10-04 12:27:40 +02:00
Maxim Zakharov
c713b1ebdd A mistype fixed 2013-09-03 15:15:34 +02:00
antirez
004f00bfa4 replicationFeedSlaves() func name typo: feedReplicationBacklogWithObject -> feedReplicationBacklog. 2013-08-12 12:38:52 +02:00
antirez
45d4e06e93 replicationFeedSlave() reworked for correctness and speed.
The previous code using a static buffer as an optimization was lame:

1) Premature optimization, actually it was *slower* than naive code
   because resulted into the creation / destruction of the object
   encapsulating the output buffer.
2) The code was very hard to test, since it was needed to have specific
   tests for command lines exceeding the size of the static buffer.
3) As a result of "2" the code was bugged as the current tests were not
   able to stress specific corner cases.

It was replaced with easy to understand code that is safer and faster.
2013-08-12 12:31:20 +02:00
antirez
3eab283b65 Fix a PSYNC bug caused by a variable name typo. 2013-08-12 11:51:03 +02:00
antirez
13f7ade551 Fix replicationFeedSlaves() off-by-one bug.
This fixes issue #1221.
2013-07-28 12:50:35 +02:00
antirez
3cb3714e99 Redis 2.7.101 (2.8 Release Candidate 1). 2013-07-18 11:26:53 +02:00
Ted Nyman
fe04710908 Make sure the log standardizes on 'timeout' 2013-07-12 23:12:27 +02:00
antirez
cdf801d92e Use getClientPeerId() for MONITOR implementation. 2013-07-11 17:09:18 +02:00
antirez
b5423b099c Fix old anetPeerToString() API call in replication.c 2013-07-11 17:07:38 +02:00
Geoff Garside
3570411def Update calls to anetPeerToString to include ip_len. 2013-07-11 17:05:03 +02:00
antirez
cdf79c063f Don't disconnect pre PSYNC replication clients for timeout.
Clients using SYNC to replicate are older implementations, such as
redis-cli --slave, and are not designed to acknowledge the master with
REPLCONF ACK commands, so we don't have any feedback and should not
disconnect them on timeout.
2013-06-26 15:24:30 +02:00
antirez
545fe0c318 Use the RSC to replicate EVALSHA unmodified.
This commit uses the Replication Script Cache in order to avoid
translating EVALSHA into EVAL whenever possible for both the AOF and
slaves.
2013-06-26 15:23:29 +02:00
antirez
9d894b1b8c Replication of scripts as EVALSHA: sha1 caching implemented.
This code is only responsible to take an LRU-evicted fixed length cache
of SHA1 that we are sure all the slaves received.

In this commit only the implementation is provided, but the Redis core
does not use it to actually send EVALSHA to slaves when possible.
2013-06-26 15:23:15 +02:00
antirez
f5275da6e9 Refresh good slaves count when setting slave state as online. 2013-05-31 07:19:42 +02:00
antirez
d0d67f8d42 min-slaves-to-write: don't accept writes with less than N replicas.
This feature allows the user to specify the minimum number of
connected replicas having a lag less or equal than the specified
amount of seconds for writes to be accepted.
2013-05-30 11:31:46 +02:00
antirez
308940aa2c Close connection with timedout slaves.
Now masters, using the time at which the last REPLCONF ACK was received,
are able to explicitly disconnect slaves that are no longer responding.

Previously the only chance was to see a very long output buffer, that
was highly suboptimal.
2013-05-27 11:43:17 +02:00
antirez
45e6a4023a Send ACK to master once every second.
ACKs can be also used as a base for synchronous replication. However in
that case they'll be explicitly requested by the master when the client
sends a request that needs to be replicated synchronously.
2013-05-27 11:43:14 +02:00
antirez
a74f8fe1ad Don't ACK the master after every command.
Sending an ACK is now moved into the replicationSendAck() function.
2013-05-27 11:43:10 +02:00
antirez
0000d5334d Make sure that REPLCONF ACK really has no return value. 2013-05-27 11:43:06 +02:00
antirez
1e77b77de4 REPLCONF ACK command.
This special command is used by the slave to inform the master the
amount of replication stream it currently consumed.

it does not return anything so that we not need to consume additional
bandwidth needed by the master to reply something.

The master can do a number of things knowing the amount of stream
processed, such as understanding the "lag" in bytes of the slave, verify
if a given command was already processed by the slave, and so forth.
2013-05-27 11:43:00 +02:00
antirez
d2a37badc2 Use GCC printf format attribute for redisLog().
This commit also fixes redisLog() statements producing warnings.
2013-02-27 12:47:16 +01:00
antirez
ff56772115 PSYNC: another change to unexpected reply from PSYNC. 2013-02-13 18:43:45 +01:00
antirez
964ee00de0 PSYNC: More robust handling of unexpected reply to PSYNC. 2013-02-13 18:34:18 +01:00
antirez
d96a66f531 Replication: more strict error checking for master PING reply. 2013-02-12 16:59:27 +01:00
antirez
31f0a6ec50 Replication: added new stats counting full and partial resynchronizations. 2013-02-12 16:26:42 +01:00
antirez
2b1398e870 PSYNC: debugging printf() calls are now logs at DEBUG level. 2013-02-12 12:58:03 +01:00
antirez
5a4ef3e5b6 Remove harmless warning in slaveTryPartialResynchronization(). 2013-02-12 12:57:59 +01:00
antirez
4fe7672dfc PSYNC: don't use the client buffer to send +CONTINUE and +FULLRESYNC.
When we are preparing an handshake with the slave we can't touch the
connection buffer as it'll be used to accumulate differences between
the sent RDB file and what arrives next from clients.

So in short we can't use addReply() family functions.

However we just use write(2) because we know that the socket buffer is
empty, since a prerequisite for SYNC to work is that the static buffer
and the output list are empty, and in general it is not expected that a
client SYNCs after doing some heavy I/O with the master.

However a short write connection is explicitly handled to avoid
fragility (we simply close the connection and the slave will retry).
2013-02-12 12:57:56 +01:00
antirez
8529dd218f SYNC not allowed with pending data on the static output buffer. 2013-02-12 12:57:52 +01:00
antirez
6726ea5b53 Log the unexpected string received in place of the SYNC payload length. 2013-02-12 12:57:48 +01:00
antirez
f5edd535d1 After SLAVEOF <newslave> don't allow chained slaves to PSYNC. 2013-02-12 12:57:44 +01:00
antirez
700e5eb4fc PSYNC: work in progress, preview #2, rebased to unstable. 2013-02-12 12:57:40 +01:00
antirez
4d8655cfd3 Use the new unified protocol to send SELECT to slaves.
SELECT was still transmitted to slaves using the inline protocol, that
is conceived mostly for humans to type into telnet sessions, and is
notably not understood by redis-cli --slave.

Now the new protocol is used instead.
2013-02-12 12:57:36 +01:00
antirez
01c21f9943 Use replicationFeedSlaves() to send PING to slaves.
A Redis master sends PING commands to slaves from time to time: doing
this ensures that even if absence of writes, the master->slave channel
remains active and the slave can feel the master presence, instead of
closing the connection for timeout.

This commit changes the way PINGs are sent to slaves in order to use the
standard interface used to replicate all the other commands, that is,
the function replicationFeedSlaves().

With this change the stream of commands sent to every slave is exactly
the same regardless of their exact state (Transferring RDB for first
synchronization or slave already online). With the previous
implementation the PING was only sent to online slaves, with the result
that the output stream from master to slaves was not identical for all
the slaves: this is a problem if we want to implement partial resyncs in
the future using a global replication stream offset.

TL;DR: this commit should not change the behaviour in practical terms,
but is just something in preparation for partial resynchronization
support.
2013-02-12 12:57:31 +01:00
antirez
5a35e485f9 Emit SELECT to slaves in a centralized way.
Before this commit every Redis slave had its own selected database ID
state. This was not actually useful as the emitted stream of commands
is identical for all the slaves.

Now the the currently selected database is a global state that is set to
-1 when a new slave is attached, in order to force the SELECT command to
be re-emitted for all the slaves.

This change is useful in order to implement replication partial
resynchronization in the future, as makes sure that the stream of
commands received by slaves, including SELECT commands, are exactly the
same for every slave connected, at any time.

In this way we could have a global offset that can identify a specific
piece of the master -> slaves stream of commands.
2013-02-12 12:57:26 +01:00
antirez
cc55a4525a Make all WATCHers dirty when the slave reloads the DB. 2013-02-08 10:27:21 +01:00
antirez
5f7dff4d16 TCP_NODELAY after SYNC: changes to the implementation. 2013-02-05 12:05:39 +01:00
charsyam
1d80acae54 Turn off TCP_NODELAY on the slave socket after SYNC.
Further details from @antirez:

It was reported by @StopForumSpam on Twitter that the Redis replication
link was strangely using multiple TCP packets for multiple commands.
This wastes a lot of bandwidth and is due to the TCP_NODELAY option we
enable on the socket after accepting a new connection.

However the master -> slave channel is a one-way channel since Redis
replication is asynchronous, so there is no point in trying to reduce
the latency, we should aim to reduce the bandwidth. For this reason this
commit introduces the ability to disable the nagle algorithm on the
socket after a successful SYNC.

This feature is off by default because the delay can be up to 40
milliseconds with normally configured Linux kernels.
2013-02-05 12:05:24 +01:00
guiquanz
1caf09399e Fixed many typos.
Conflicts fixed, mainly because 2.8 has no cluster support / files:
	00-RELEASENOTES
	src/cluster.c
	src/crc16.c
	src/redis-trib.rb
	src/redis.h
2013-01-19 11:03:19 +01:00
antirez
aa9497fdb3 Undo slave-master handshake when SLAVEOF sets a new slave.
Issue #828 shows how Redis was not correctly undoing a non-blocking
connection attempt with the previous master when the master was set to a
new address using the SLAVEOF command.

This was also a result of lack of refactoring, so now there is a
function to cancel the non blocking handshake with the master.
The new function is now used when SLAVEOF NO ONE is called or when
SLAVEOF is used to set the master to a different address.
2013-01-15 13:33:27 +01:00
antirez
cefff3a1b3 Better error reporting when fd event creation fails. 2013-01-03 14:32:32 +01:00
antirez
a6d117b6c0 serverCron() frequency is now a runtime parameter (was REDIS_HZ).
REDIS_HZ is the frequency our serverCron() function is called with.
A more frequent call to this function results into less latency when the
server is trying to handle very expansive background operations like
mass expires of a lot of keys at the same time.

Redis 2.4 used to have an HZ of 10. This was good enough with almost
every setup, but the incremental key expiration algorithm was working a
bit better under *extreme* pressure when HZ was set to 100 for Redis
2.6.

However for most users a latency spike of 30 milliseconds when million
of keys are expiring at the same time is acceptable, on the other hand a
default HZ of 100 in Redis 2.6 was causing idle instances to use some
CPU time compared to Redis 2.4. The CPU usage was in the order of 0.3%
for an idle instance, however this is a shame as more energy is consumed
by the server, if not important resources.

This commit introduces HZ as a runtime parameter, that can be queried by
INFO or CONFIG GET, and can be modified with CONFIG SET. At the same
time the default frequency is set back to 10.

In this way we default to a sane value of 10, but allows users to
easily switch to values up to 500 for near real-time applications if
needed and if they are willing to pay this small CPU usage penalty.
2012-12-14 17:20:21 +01:00
antirez
8ddb23b90c BSD license added to every C source and header file. 2012-11-08 18:34:04 +01:00
antirez
d775b35715 Unix socket clients properly displayed in MONITOR and CLIENT LIST.
This also fixes issue #745.
2012-11-01 22:12:50 +01:00
antirez
0c19880ce6 "Timeout receiving bulk data" error message modified.
The new message now contains an hint about modifying the repl-timeout
configuration directive if the problem persists.

This should normally not be needed, because while the master generates
the RDB file it makes sure to send newlines to the replication channel
to prevent timeouts. However there are times when masters running on
very slow systems can completely stop for seconds during the RDB saving
process. In such a case enlarging the timeout value can fix the problem.

See issue #695 for an example of this problem in an EC2 deployment.
2012-10-04 11:49:17 +02:00
antirez
8b6b1b27cc Fix compilation on FreeBSD. Thanks to @koobs on twitter. 2012-09-17 12:45:57 +02:00
Saj Goonatilleke
0671d88cab Bug fix: slaves being pinged every second
REDIS_REPL_PING_SLAVE_PERIOD controls how often the master should
transmit a heartbeat (PING) to its slaves.  This period, which defaults
to 10, is measured in seconds.

Redis 2.4 masters used to ping their slaves every ten seconds, just like
it says on the tin.

The Redis 2.6 masters I have been experimenting with, on the other hand,
ping their slaves *every second*.  (master_last_io_seconds_ago never
approaches 10.)  I think the ping period was inadvertently slashed to
one-tenth of its nominal value around the time REDIS_HZ was introduced.
This commit reintroduces correct ping schedule behaviour.
2012-09-05 16:01:01 +02:00