If we can't reconfigure a slave in time during failover, go forward as
anyway the slave will be fixed by Sentinels in the future, once they
detect it is misconfigured.
Otherwise a failover in progress may never terminate if for some reason
the slave is uncapable to sync with the master while at the same time
it is not disconnected.
Some inline test moved into server_is_up procedure.
Also find_available_port was moved into util since it is going
to be used for the Sentinel test as well.
The code tried to obtain the configuration file absolute path after
processing the configuration file. However if config file was a relative
path and a "dir" statement was processed reading the config, the absolute
path obtained was wrong.
With this fix the absolute path is obtained before processing the
configuration while the server is still in the original directory where
it was executed.
Now it logs the file name if it is not accessible. Also there is a
different error for the missing config file case, and for the non
writable file case.
server.unixtime and server.mstime are cached less precise timestamps
that we use every time we don't need an accurate time representation and
a syscall would be too slow for the number of calls we require.
Such an example is the initialization and update process of the last
interaction time with the client, that is used for timeouts.
However rdbLoad() can take some time to load the DB, but at the same
time it did not updated the time during DB loading. This resulted in the
bug described in issue #1535, where in the replication process the slave
loads the DB, creates the redisClient representation of its master, but
the timestamp is so old that the master, under certain conditions, is
sensed as already "timed out".
Thanks to @yoav-steinberg and Redis Labs Inc for the bug report and
analysis.
This commit fixes a serious Lua scripting replication issue, described
by Github issue #1549. The root cause of the problem is that scripts
were put inside the script cache, assuming that slaves and AOF already
contained it, even if the scripts sometimes produced no changes in the
data set, and were not actaully propagated to AOF/slaves.
Example:
eval "if tonumber(KEYS[1]) > 0 then redis.call('incr', 'x') end" 1 0
Then:
evalsha <sha1 step 1 script> 1 0
At this step sha1 of the script is added to the replication script cache
(the script is marked as known to the slaves) and EVALSHA command is
transformed to EVAL. However it is not dirty (there is no changes to db),
so it is not propagated to the slaves. Then the script is called again:
evalsha <sha1 step 1 script> 1 1
At this step master checks that the script already exists in the
replication script cache and doesn't transform it to EVAL command. It is
dirty and propagated to the slaves, but they fail to evaluate the script
as they don't have it in the script cache.
The fix is trivial and just uses the new API to force the propagation of
the executed command regardless of the dirty state of the data set.
Thank you to @minus-infinity on Github for finding the issue,
understanding the root cause, and fixing it.
A system similar to the RDB write error handling is used, in which when
we can't write to the AOF file, writes are no longer accepted until we
are able to write again.
For fsync == always we still abort on errors since there is currently no
easy way to avoid replying with success to the user otherwise, and this
would violate the contract with the user of only acknowledging data
already secured on disk.
Sometime an osx master with a Linux server over a slow link caused
a strange error where osx called the writable function for
the socket but actually apparently there was no room in the socket
buffer to accept the write: write(2) call returned an EAGAIN error,
that was not checked, so we considered write(2) == 0 always as a connection
reset, which was unfortunate since the bulk transfer has to start again.
Also more errors are logged with the WARNING level in the same code path
now.
The define is now used in other parts of Redis 2.8 tree instead of long
long.
A nice side effect is that now 2.8 and unstable sentinel.c files are
identical as it should be.
Keys expiring in the middle of the execution of Lua scripts are to
create inconsistencies in masters and / or AOF files. See the following
example:
if redis.call("exists",KEYS[1]) == 1
then
redis.call("incr","mycounter")
end
if redis.call("exists",KEYS[1]) == 1
then
return redis.call("incr","mycounter")
end
The script executes two times the same *if key exists then incrementcounter*
logic. However the two executions will work differently in the master and
the slaves, provided some unlucky timing happens.
In the master the first time the key may still exist, while the second time
the key may no longer exist. This will result in the key incremented just one
time. However as a side effect the master will generate a synthetic
`DEL` command in the replication channel in order to force the slaves to
expire the key (given that key expiration is master-driven).
When the same script will run in the slave, the key will no longer be
there, so the script will not increment the key.
The key idea used to implement the expire-at-first-lookup semantics was
provided by Marc Gravell.
server.lua_time_start is expressed in milliseconds. Use mstime_t instead
of long long, and populate it with mstime() instead of ustime()/1000.
Functionally identical but more natural.
The Redis test uses a server-clients model in order to parallelize the
execution of different tests. However in recent versions of osx not
setting the channel to a binary encoding caused issues even if AFAIK no
binary data is really sent via this channel. However now the channels
are deliberately set to a binary encoding and this solves the issue.
The exact issue was the test not terminating and giving the impression
of running forever, since test clients or servers were unable to
exchange the messages to continue.
In high RPS environments, the default listen backlog is not sufficient, so
giving users the power to configure it is the right approach, especially
since it requires only minor modifications to the code.
Incremental flushing in rio.c is only used to avoid huge kernel buffers
synched to slow disks creating big latency spikes, so this fix has no
durability implications, however it is certainly more correct to make
sure that the FILE buffers are flushed to the kernel before calling
fsync on the file descriptor.
Thanks to Li Shao Kai for reporting this issue in the Redis mailing
list.
The new command allows to change master-specific configurations
at runtime. All the settable parameters can be retrivied via the
SENTINEL MASTER command, so there is no equivalent "GET" command.
The claim about unlinking the instance from the connected hash tables
was the opposite of the reality. Also the current actual behavior is
safer in most cases, so it is better to manually unlink when needed.