This should improve things in two ways:
1) Prevent timeouts caused by the execution of long commands.
2) Improve detection of real connection errors.
This is mostly effective only on Linux because of the bogus default
keepalive settings. In Linux we have OS-specific calls to set the
keepalive interval to reasonable values.
This cased a segfault in some Linux system and was GCC-specific.
Commit modified by @antirez:
1) Stripped away the part to set the proc title via config for now.
2) Handle initialization of setproctitle only when the replacement
is used.
3) Don't require GCC now that the attribute constructor is no
longer used.
This commit allows Redis to set a process name that includes the binding
address and the port number in order to make operations simpler.
Redis children processes doing AOF rewrites or RDB saving change the
name into redis-aof-rewrite and redis-rdb-bgsave respectively.
This in general makes harder to kill the wrong process because of an
error and makes simpler to identify saving children.
This feature was suggested by Arnaud GRANAL in the Redis Google Group,
Arnaud also pointed me to the setproctitle.c implementation includeed in
this commit.
This feature should work on all the Linux, OSX, and all the three major
BSD systems.
When we are preparing an handshake with the slave we can't touch the
connection buffer as it'll be used to accumulate differences between
the sent RDB file and what arrives next from clients.
So in short we can't use addReply() family functions.
However we just use write(2) because we know that the socket buffer is
empty, since a prerequisite for SYNC to work is that the static buffer
and the output list are empty, and in general it is not expected that a
client SYNCs after doing some heavy I/O with the master.
However a short write connection is explicitly handled to avoid
fragility (we simply close the connection and the slave will retry).
SELECT was still transmitted to slaves using the inline protocol, that
is conceived mostly for humans to type into telnet sessions, and is
notably not understood by redis-cli --slave.
Now the new protocol is used instead.
A Redis master sends PING commands to slaves from time to time: doing
this ensures that even if absence of writes, the master->slave channel
remains active and the slave can feel the master presence, instead of
closing the connection for timeout.
This commit changes the way PINGs are sent to slaves in order to use the
standard interface used to replicate all the other commands, that is,
the function replicationFeedSlaves().
With this change the stream of commands sent to every slave is exactly
the same regardless of their exact state (Transferring RDB for first
synchronization or slave already online). With the previous
implementation the PING was only sent to online slaves, with the result
that the output stream from master to slaves was not identical for all
the slaves: this is a problem if we want to implement partial resyncs in
the future using a global replication stream offset.
TL;DR: this commit should not change the behaviour in practical terms,
but is just something in preparation for partial resynchronization
support.
Before this commit every Redis slave had its own selected database ID
state. This was not actually useful as the emitted stream of commands
is identical for all the slaves.
Now the the currently selected database is a global state that is set to
-1 when a new slave is attached, in order to force the SELECT command to
be re-emitted for all the slaves.
This change is useful in order to implement replication partial
resynchronization in the future, as makes sure that the stream of
commands received by slaves, including SELECT commands, are exactly the
same for every slave connected, at any time.
In this way we could have a global offset that can identify a specific
piece of the master -> slaves stream of commands.
Further details from @antirez:
It was reported by @StopForumSpam on Twitter that the Redis replication
link was strangely using multiple TCP packets for multiple commands.
This wastes a lot of bandwidth and is due to the TCP_NODELAY option we
enable on the socket after accepting a new connection.
However the master -> slave channel is a one-way channel since Redis
replication is asynchronous, so there is no point in trying to reduce
the latency, we should aim to reduce the bandwidth. For this reason this
commit introduces the ability to disable the nagle algorithm on the
socket after a successful SYNC.
This feature is off by default because the delay can be up to 40
milliseconds with normally configured Linux kernels.