118 Commits

Author SHA1 Message Date
antirez
89f82a25e1 Diskless replication: EOF:<mark> streaming support slave side. 2014-10-29 14:33:49 +01:00
antirez
583a762cd9 Diskless replication: redis.conf and CONFIG SET/GET support. 2014-10-29 14:33:49 +01:00
antirez
31b0e13268 Diskless replication: trigger a BGSAVE after a config change.
If we turn from diskless to disk-based replication via CONFIG SET, we
need a way to start a BGSAVE if there are slaves already waiting for a
BGSAVE to start. Normally with disk-based replication we do it as soon
as the previous child exits, but when there is a configuration change
via CONFIG SET, we may have slaves in WAIT_BGSAVE_START state without
an RDB background process currently active.
2014-10-29 14:33:49 +01:00
antirez
d403e4ae63 Diskless replication flag renamed repl_diskless -> repl_diskless_sync. 2014-10-29 14:33:49 +01:00
antirez
2f0b58e4ed Diskless replication: trigger diskless RDB transfer if needed. 2014-10-29 14:33:49 +01:00
antirez
82bfae5b70 Diskless replication: handle putting the slave online. 2014-10-29 14:33:49 +01:00
antirez
1b4cadb664 Diskless replication: RDB -> slaves transfer draft implementation. 2014-10-29 14:33:49 +01:00
antirez
aef4c60c78 Add some comments in syncCommand() to clarify RDB target. 2014-10-29 14:33:49 +01:00
antirez
ff8a3baaf7 Replication: better way to send a preamble before RDB payload.
During the replication full resynchronization process, the RDB file is
transferred from the master to the slave. However there is a short
preamble to send, that is currently just the bulk payload length of the
file in the usual Redis form $..length..<CR><LF>.

This preamble used to be sent with a direct write call, assuming that
there was always room in the socket output buffer to hold the few bytes
needed, however this does not scale in case we'll need to send more
stuff, and is not very robust code in general.

This commit introduces a more general mechanism to send a preamble up to
2GB in size (the max length of an sds string) in a non blocking way.
2014-10-29 14:33:30 +01:00
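As an illustration of the preamble described in the commit above, here is a minimal sketch that formats the bulk payload length as `$<length><CR><LF>`. The helper name `format_rdb_preamble` and its signature are assumptions made for this example; they are not taken from replication.c, and the sketch does not show the non-blocking send mechanism itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from replication.c): write the bulk length
 * preamble "$<len>\r\n" into buf. Returns the number of bytes written
 * (excluding the terminating NUL), or -1 if buf is too small. */
int format_rdb_preamble(char *buf, size_t buflen, long long rdb_len) {
    assert(buf != NULL && rdb_len >= 0);
    int n = snprintf(buf, buflen, "$%lld\r\n", rdb_len);
    if (n < 0 || (size_t)n >= buflen) return -1; /* error or truncated */
    return n;
}
```

For an RDB file of 1024 bytes, the preamble produced by this sketch is the 7 bytes `$1024\r\n`.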
Aaron Rutkovsky
3101760937 Fix typos
Closes #1513
2014-10-06 09:59:02 +02:00
Jan-Erik Rediger
575e293ddd Fix typo: ad -> and
Closes #1537
2014-10-06 09:57:01 +02:00
antirez
60ff8095d6 No more trailing spaces in Redis source code. 2014-06-26 18:52:16 +02:00
antirez
8c460a2872 ROLE command: array len fixed for slave output. 2014-06-21 15:29:11 +02:00
antirez
8060de98f8 ROLE output improved for slaves.
Info about the replication state with the master added.
2014-06-21 15:27:12 +02:00
antirez
41a15205b6 ROLE command added.
The new ROLE command is designed in order to provide a client with
information about the replication in a fast and easy to use way
compared to the INFO command where the same information is also
available.
2014-06-21 15:27:12 +02:00
antirez
d8d415e717 CLIENT LIST speedup via peerid caching + smart allocation.
This commit adds peer ID caching in the client structure plus an API
change and the use of sdsMakeRoomFor() in order to improve the
reallocation pattern to generate the CLIENT LIST output.

Both changes account for a very significant speedup.
2014-06-21 15:27:08 +02:00
antirez
301a0cfc69 Check for EAGAIN in sendBulkToSlave().
Sometimes an osx master with a Linux server over a slow link caused
a strange error where osx called the writable function for
the socket but actually apparently there was no room in the socket
buffer to accept the write: write(2) call returned an EAGAIN error,
that was not checked, so we considered write(2) == 0 always as a connection
reset, which was unfortunate since the bulk transfer has to start again.

Also more errors are logged with the WARNING level in the same code path
now.
2014-02-05 16:41:04 +01:00
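The distinction made by the commit above can be sketched as a small classification of the write(2) result: an EAGAIN error means "try again when the socket is writable", not "connection reset". The enum and the function name `classify_write` are hypothetical names introduced for this example and do not appear in sendBulkToSlave().

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

typedef enum { WRITE_OK, WRITE_RETRY, WRITE_FAILED } write_status;

/* Hypothetical helper: decide what to do after a write(2) call.
 * nwritten is the return value of write(2); saved_errno is the value of
 * errno captured right after the call. */
write_status classify_write(ssize_t nwritten, int saved_errno) {
    assert(nwritten >= -1);
    if (nwritten >= 0) return WRITE_OK;              /* full or partial write */
    if (saved_errno == EAGAIN) return WRITE_RETRY;   /* no room yet: wait for
                                                        the next writable event */
    return WRITE_FAILED;                             /* real error: log it and
                                                        drop the transfer */
}
```

With this split, only WRITE_FAILED would force the bulk transfer to start again.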
antirez
2a1a31ca9d Don't send REPLCONF ACK to old masters.
Masters not understanding REPLCONF ACK will reply with errors to our
requests causing a number of possible issues.

This commit detects a global replication offset set to -1 at the end of
the replication, and marks the client representing the master with the
REDIS_PRE_PSYNC flag.

Note that this flag was called REDIS_PRE_PSYNC_SLAVE but now it is just
REDIS_PRE_PSYNC as it is used for both slaves and masters starting with
this commit.

This commit fixes issue #1488.
2014-01-08 14:27:49 +01:00
antirez
418d3d358a Clarify a comment in slaveTryPartialResynchronization(). 2014-01-08 14:11:02 +01:00
antirez
4456ee1173 Make new masters inherit replication offsets.
Currently replication offsets can be used only in a limited way in order
to understand, out of a set of slaves, which one has the most
up-to-date data. For example this comparison is possible if N slaves
were all replicating with the same master.

However the replication offset was not transferred from master to slaves
(that are later promoted as masters) in any way, so for instance if
there were three instances A, B, C, with A master and B and C
replicating from A, the following could happen:

C disconnects from A.
B is turned into master.
A is switched to slave of B.
B receives some write.

In this context there was no way to compare the offset of A and C,
because B would use its own local master replication offset as
replication offset to initialize the replication with A.

With this commit what happens is that when B is turned into master it
inherits the replication offset from A, making A and C comparable.
In the above case assuming no inconsistencies are created during the
disconnection and failover process, A will show to have a replication
offset greater than C.

Note that this does not mean offsets are always comparable to understand
which instance, in a set of instances, has the most up-to-date data,
since in more complex examples the
replica with the higher replication offset could be partitioned away
when picking the instance to elect as new master. However this in
general improves the ability of a system to try to pick a good replica
to promote to master.
2013-12-22 11:54:10 +01:00
antirez
563d6b3f98 Slaves heartbeats during sync improved.
The previous fix for false positive timeout detected by master was not
complete. There is another blocking stage while loading data for the
first synchronization with the master, that is, flushing away the
current data from the DB memory.

This commit uses the newly introduced dict.c callback in order to make
some incremental work (to send "\n" heartbeats to the master) while
flushing the old data from memory.

It is hard to write a regression test for this issue unfortunately. More
support for debugging in the Redis core would be needed in terms of
functionalities to simulate a slow DB loading / deletion.
2013-12-10 18:42:22 +01:00
antirez
b6610a569d dict.c: added optional callback to dictEmpty().
Redis hash table implementation has many non-blocking features like
incremental rehashing, however while deleting a large hash table there
was no way to have a callback called to do some incremental work.

This commit adds this support, as an optional callback argument to
dictEmpty() that is currently called at a fixed interval (one time every
65k deletions).
2013-12-10 18:18:36 +01:00
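The fixed-interval callback described in the commit above can be sketched as follows. This is a simplified stand-in, not the dict.c code: `empty_buckets`, `tick_callback` and `count_tick` are hypothetical names, the loop counts generic steps rather than real hash table entries, and the interval check assumes a mask of 65535 (one call every 65536 steps).

```c
#include <assert.h>
#include <stddef.h>

typedef void tick_callback(void *privdata);

/* Hypothetical stand-in for clearing a large table in nsteps steps.
 * The optional callback runs once every 65536 steps so the caller can do
 * some incremental work (for instance sending a heartbeat).
 * Returns the number of steps performed. */
unsigned long empty_buckets(unsigned long nsteps, tick_callback *cb,
                            void *privdata) {
    unsigned long i;
    for (i = 0; i < nsteps; i++) {
        if (cb && (i & 65535) == 0) cb(privdata);
        /* ... release whatever is stored at step i here ... */
    }
    return i;
}

/* Example callback: count how many times it was invoked. */
void count_tick(void *privdata) {
    assert(privdata != NULL);
    (*(unsigned long *)privdata)++;
}
```

Passing NULL as the callback keeps the plain blocking behavior, which is why the argument can stay optional.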
antirez
26cf5c8ac6 Log empty DB + Loading data into two separated messages. 2013-12-10 17:51:16 +01:00
antirez
7f6743a581 Fixed grammar: before H the article is a, not an. 2013-12-05 16:37:21 +01:00
antirez
50d140e90b SLAVEOF command refactored into a proper API.
We now have replicationSetMaster() and replicationUnsetMaster() that can
be called in other contexts (for instance Redis Cluster).
2013-11-28 16:19:16 +01:00
antirez
98682cd178 Log to what master a slave is going to connect to. 2013-11-11 09:25:41 +01:00
antirez
df0c96002d Replication: install the write handler when reusing a cached master.
Sometimes when we resurrect a cached master after a successful partial
resynchronization attempt, there is pending data in the output buffers
of the client structure representing the master (likely REPLCONF ACK
commands).

If we don't reinstall the write handler, it will never be installed
again by addReply*() family functions as they'll assume that if there is
already data pending, the write handler is already installed.

This bug caused some slaves after a successful partial sync to never
send REPLCONF ACK, and continuously being detected as timing out by the
master, with a disconnection / reconnection loop.
2013-10-04 16:14:57 +02:00
antirez
c8c1006cf4 PSYNC: safer handling of PSYNC requests.
There was a bug that overestimated the amount of backlog available,
however this could only happen when a slave was asking for an offset
that was in the "future" compared to the master replication backlog.

Now this case is handled well and logged as an incident in the master
log file.
2013-10-04 12:27:40 +02:00
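A minimal sketch of the kind of range check the commit above implies: a requested offset can be served from the backlog only if it is neither older than the first byte still stored nor in the "future". The function and parameter names are assumptions for this example, not identifiers confirmed by the commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical check. backlog_off is the replication offset of the first
 * byte held in the backlog; histlen is the number of valid bytes in it. */
bool psync_offset_is_servable(long long psync_offset, long long backlog_off,
                              long long histlen) {
    assert(histlen >= 0);
    if (psync_offset < backlog_off) return false;           /* data already gone */
    if (psync_offset > backlog_off + histlen) return false; /* offset in the "future" */
    return true;
}
```

A request that fails the second test is the incident case the commit says is now logged by the master.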
Maxim Zakharov
c713b1ebdd A mistype fixed 2013-09-03 15:15:34 +02:00
antirez
004f00bfa4 replicationFeedSlaves() func name typo: feedReplicationBacklogWithObject -> feedReplicationBacklog. 2013-08-12 12:38:52 +02:00
antirez
45d4e06e93 replicationFeedSlave() reworked for correctness and speed.
The previous code using a static buffer as an optimization was lame:

1) Premature optimization, actually it was *slower* than naive code
   because it resulted in the creation / destruction of the object
   encapsulating the output buffer.
2) The code was very hard to test, since it was needed to have specific
   tests for command lines exceeding the size of the static buffer.
3) As a result of "2" the code was bugged as the current tests were not
   able to stress specific corner cases.

It was replaced with easy to understand code that is safer and faster.
2013-08-12 12:31:20 +02:00
antirez
3eab283b65 Fix a PSYNC bug caused by a variable name typo. 2013-08-12 11:51:03 +02:00
antirez
13f7ade551 Fix replicationFeedSlaves() off-by-one bug.
This fixes issue #1221.
2013-07-28 12:50:35 +02:00
antirez
3cb3714e99 Redis 2.7.101 (2.8 Release Candidate 1). 2013-07-18 11:26:53 +02:00
Ted Nyman
fe04710908 Make sure the log standardizes on 'timeout' 2013-07-12 23:12:27 +02:00
antirez
cdf801d92e Use getClientPeerId() for MONITOR implementation. 2013-07-11 17:09:18 +02:00
antirez
b5423b099c Fix old anetPeerToString() API call in replication.c 2013-07-11 17:07:38 +02:00
Geoff Garside
3570411def Update calls to anetPeerToString to include ip_len. 2013-07-11 17:05:03 +02:00
antirez
cdf79c063f Don't disconnect pre PSYNC replication clients for timeout.
Clients using SYNC to replicate are older implementations, such as
redis-cli --slave, and are not designed to acknowledge the master with
REPLCONF ACK commands, so we don't have any feedback and should not
disconnect them on timeout.
2013-06-26 15:24:30 +02:00
antirez
545fe0c318 Use the RSC to replicate EVALSHA unmodified.
This commit uses the Replication Script Cache in order to avoid
translating EVALSHA into EVAL whenever possible for both the AOF and
slaves.
2013-06-26 15:23:29 +02:00
antirez
9d894b1b8c Replication of scripts as EVALSHA: sha1 caching implemented.
This code is only responsible for maintaining an LRU-evicted fixed length cache
of SHA1 that we are sure all the slaves received.

In this commit only the implementation is provided, but the Redis core
does not use it to actually send EVALSHA to slaves when possible.
2013-06-26 15:23:15 +02:00
antirez
f5275da6e9 Refresh good slaves count when setting slave state as online. 2013-05-31 07:19:42 +02:00
antirez
d0d67f8d42 min-slaves-to-write: don't accept writes with less than N replicas.
This feature allows the user to specify the minimum number of
connected replicas having a lag less than or equal to the specified
amount of seconds for writes to be accepted.
2013-05-30 11:31:46 +02:00
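The rule in the commit above can be sketched as a count of "good" replicas followed by a threshold test. The names below, and the choice of measuring lag as the time since the last acknowledgement from each replica, are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: count replicas whose last acknowledgement is at
 * most max_lag seconds old. last_ack[i] is the time of replica i's most
 * recent ACK. */
size_t count_good_replicas(const time_t *last_ack, size_t n, time_t now,
                           time_t max_lag) {
    assert(last_ack != NULL || n == 0);
    size_t good = 0;
    for (size_t i = 0; i < n; i++) {
        if (now - last_ack[i] <= max_lag) good++;
    }
    return good;
}

/* Writes are accepted only with at least min_replicas good replicas. */
bool writes_allowed(const time_t *last_ack, size_t n, time_t now,
                    time_t max_lag, size_t min_replicas) {
    return count_good_replicas(last_ack, n, now, max_lag) >= min_replicas;
}
```

For example, with three replicas lagging 5, 20 and 1 seconds and a maximum lag of 10 seconds, two replicas count as good, so writes pass with a minimum of 2 and are refused with a minimum of 3.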
antirez
308940aa2c Close connection with timedout slaves.
Now masters, using the time at which the last REPLCONF ACK was received,
are able to explicitly disconnect slaves that are no longer responding.

Previously the only chance was to see a very long output buffer, which
was highly suboptimal.
2013-05-27 11:43:17 +02:00
antirez
45e6a4023a Send ACK to master once every second.
ACKs can be also used as a base for synchronous replication. However in
that case they'll be explicitly requested by the master when the client
sends a request that needs to be replicated synchronously.
2013-05-27 11:43:14 +02:00
antirez
a74f8fe1ad Don't ACK the master after every command.
Sending an ACK is now moved into the replicationSendAck() function.
2013-05-27 11:43:10 +02:00
antirez
0000d5334d Make sure that REPLCONF ACK really has no return value. 2013-05-27 11:43:06 +02:00
antirez
1e77b77de4 REPLCONF ACK command.
This special command is used by the slave to inform the master the
amount of replication stream it currently consumed.

It does not return anything so that we do not need to consume additional
bandwidth for the master to send a reply.

The master can do a number of things knowing the amount of stream
processed, such as understanding the "lag" in bytes of the slave, verify
if a given command was already processed by the slave, and so forth.
2013-05-27 11:43:00 +02:00
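The two uses of the acknowledged offset mentioned in the commit above can be sketched with simple arithmetic. Both function names are hypothetical, and the sketch assumes offsets are byte counts of the replication stream.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: lag in bytes between what the master has produced
 * and what the slave reported as consumed via REPLCONF ACK. */
long long replica_lag_bytes(long long master_offset, long long acked_offset) {
    assert(acked_offset <= master_offset);
    return master_offset - acked_offset;
}

/* Hypothetical helper: a command whose last byte sits at
 * command_end_offset was processed once the slave acknowledged at least
 * that offset. */
bool replica_processed(long long acked_offset, long long command_end_offset) {
    return acked_offset >= command_end_offset;
}
```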
antirez
d2a37badc2 Use GCC printf format attribute for redisLog().
This commit also fixes redisLog() statements producing warnings.
2013-02-27 12:47:16 +01:00
antirez
ff56772115 PSYNC: another change to unexpected reply from PSYNC. 2013-02-13 18:43:45 +01:00