3639 Commits

Author SHA1 Message Date
antirez
96973a7c33 AOF write error: retry with a frequency of 1 hz. 2014-02-12 16:57:17 +01:00
antirez
fadbbdd3f4 AOF: don't abort on write errors unless fsync is 'always'.
A system similar to the RDB write error handling is used, in which when
we can't write to the AOF file, writes are no longer accepted until we
are able to write again.

For fsync == always we still abort on errors since there is currently no
easy way to avoid replying with success to the user otherwise, and this
would violate the contract with the user of only acknowledging data
already secured on disk.
2014-02-12 16:57:13 +01:00
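The policy described in the two AOF commits above can be sketched as a small decision function. This is only an illustration: the type and function names are hypothetical and are not the identifiers used in the Redis source, and the 1000 ms interval is just the "1 hz" retry frequency expressed in milliseconds.

```c
#include <assert.h>

/* Hypothetical names, for illustration only. */
typedef enum { FSYNC_NO, FSYNC_EVERYSEC, FSYNC_ALWAYS } fsync_policy;
typedef enum { AOF_ABORT, AOF_REJECT_WRITES } aof_error_action;

/* Decide what to do after a failed AOF write. With fsync == always the
 * server cannot avoid acknowledging unsaved data, so it still aborts.
 * Otherwise writes are refused until the AOF is writable again. */
aof_error_action aof_on_write_error(fsync_policy policy) {
    assert(policy == FSYNC_NO || policy == FSYNC_EVERYSEC ||
           policy == FSYNC_ALWAYS);
    if (policy == FSYNC_ALWAYS) return AOF_ABORT;
    return AOF_REJECT_WRITES;
}

/* Retry the failed write at most once per second (1 hz). */
int aof_should_retry(long long now_ms, long long last_attempt_ms) {
    return now_ms - last_attempt_ms >= 1000;
}
```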
antirez
688d32e16b Don't count time to feed MONITORs in SLOWLOG. 2014-02-07 18:29:26 +01:00
antirez
3e4968339b Sentinel: allow SHUTDOWN command in Sentinel mode. 2014-02-07 11:22:30 +01:00
antirez
301a0cfc69 Check for EAGAIN in sendBulkToSlave().
Sometimes an osx master with a Linux server over a slow link caused
a strange error where osx called the writable function for
the socket but actually apparently there was no room in the socket
buffer to accept the write: write(2) call returned an EAGAIN error,
that was not checked, so we considered write(2) == 0 always as a connection
reset, which was unfortunate since the bulk transfer has to start again.

Also more errors are logged with the WARNING level in the same code path
now.
2014-02-05 16:41:04 +01:00
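A minimal sketch of the check described above, assuming a non-blocking socket. The function and enum names are hypothetical and this is not the code of sendBulkToSlave(); it only shows how an EAGAIN result can be told apart from a real failure.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/* Hypothetical names, for illustration only. */
typedef enum { WRITE_PROGRESS, WRITE_RETRY_LATER, WRITE_FAILED } write_outcome;

/* Classify the return value of write(2) on a non-blocking socket.
 * nwritten is what write(2) returned, err is errno sampled right after. */
write_outcome classify_write(ssize_t nwritten, int err) {
    assert(nwritten >= -1);
    if (nwritten > 0) return WRITE_PROGRESS;
    /* No room in the socket buffer: not an error, just try again later. */
    if (nwritten == -1 && err == EAGAIN) return WRITE_RETRY_LATER;
    /* Anything else is treated as a failed connection in this sketch. */
    return WRITE_FAILED;
}
```

With this split, only WRITE_FAILED needs to drop the connection and restart the bulk transfer.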
antirez
4e809a9a19 Redis 2.8.5. 2.8.5 2014-02-04 11:17:21 +01:00
antirez
ddcf160309 Move mstime_t define outside sentinel.c.
The define is now used in other parts of Redis 2.8 tree instead of long
long.

A nice side effect is that now 2.8 and unstable sentinel.c files are
identical as it should be.
2014-02-03 16:34:46 +01:00
antirez
c5bc592650 Scripting: expire keys in scripts only at first access.
Keys expiring in the middle of the execution of Lua scripts can
create inconsistencies in masters and / or AOF files. See the following
example:

    if redis.call("exists",KEYS[1]) == 1
    then
        redis.call("incr","mycounter")
    end

    if redis.call("exists",KEYS[1]) == 1
    then
        return redis.call("incr","mycounter")
    end

The script executes the same *if key exists then increment counter* logic
twice. However, the two executions will behave differently in the master and
the slaves, provided some unlucky timing happens.

In the master the first time the key may still exist, while the second time
the key may no longer exist. This will result in the key incremented just one
time. However as a side effect the master will generate a synthetic
`DEL` command in the replication channel in order to force the slaves to
expire the key (given that key expiration is master-driven).

When the same script runs in the slave, the key will no longer be
there, so the script will not increment the key.

The key idea used to implement the expire-at-first-lookup semantics was
provided by Marc Gravell.
2014-02-03 16:29:25 +01:00
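One way to get a stable view of a key for the whole script is to evaluate expiration against the time the script started rather than the current time. The sketch below assumes that approach; the struct and function names are hypothetical and the commit message does not spell out the implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a key with an absolute expire time. */
typedef struct { long long expire_at_ms; } demo_key;

/* Returns 1 if the key must be considered expired. While a script runs
 * (in_script != 0) the clock is frozen at script_start_ms, so a key that
 * existed at the first lookup exists at every later lookup as well. */
int key_is_expired(const demo_key *k, long long now_ms,
                   int in_script, long long script_start_ms) {
    assert(k != NULL);
    long long t = in_script ? script_start_ms : now_ms;
    return t > k->expire_at_ms;
}
```

In the example script above, both `exists` calls would then see the same answer, so master and slaves increment the counter the same number of times.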
antirez
5201ca0ca1 Allow CONFIG and SHUTDOWN while in stale-slave state. 2014-02-03 15:51:07 +01:00
antirez
3da5cbe5bb Scripting: use mstime() and mstime_t for lua_time_start.
server.lua_time_start is expressed in milliseconds. Use mstime_t instead
of long long, and populate it with mstime() instead of ustime()/1000.

Functionally identical but more natural.
2014-02-03 15:46:47 +01:00
PatrickJS
0be31e2d22 update copyright year 2014-02-03 11:19:25 +01:00
antirez
66304fb122 Test: fixed osx msg passing issue in testing framework.
The Redis test uses a server-clients model in order to parallelize the
execution of different tests. However in recent versions of osx not
setting the channel to a binary encoding caused issues even if AFAIK no
binary data is really sent via this channel. However now the channels
are deliberately set to a binary encoding and this solves the issue.

The exact issue was the test not terminating and giving the impression
of running forever, since test clients or servers were unable to
exchange the messages to continue.
2014-01-31 16:25:13 +01:00
antirez
1406112ec4 Redis.conf comment about tcp-backlog option improved. 2014-01-31 15:03:25 +01:00
antirez
917b851491 Option "backlog" renamed "tcp-backlog".
This is especially important since we already have a concept of backlog
(the replication backlog).
2014-01-31 15:03:22 +01:00
Nenad Merdanovic
8dda9dbef0 Add support for listen(2) backlog definition
In high RPS environments, the default listen backlog is not sufficient, so
giving users the power to configure it is the right approach, especially
since it requires only minor modifications to the code.
2014-01-31 15:03:19 +01:00
antirez
f0652c37a5 Sentinel: check arity for SENTINEL MASTER command.
This fixes issue #1530.
2014-01-31 10:14:07 +01:00
antirez
a2c9d38a9f SENTINEL SET master quorum implemented. 2014-01-28 11:23:51 +01:00
antirez
89a731b3aa Fixed inverted if condition in MISCONF error code path. 2014-01-28 10:10:56 +01:00
antirez
ecfefde760 Don't log MONITOR clients as disconnecting slaves. 2014-01-25 11:54:02 +01:00
antirez
43503ae5d6 redis-cli --help output improved with --scan and periods. 2014-01-22 12:08:08 +01:00
antirez
b9c84d4aef redis-cli: support for --scan option. 2014-01-22 12:08:04 +01:00
antirez
248e916550 Use fflush() before fsync() in rio.c.
Incremental flushing in rio.c is only used to avoid huge kernel buffers
synched to slow disks creating big latency spikes, so this fix has no
durability implications, however it is certainly more correct to make
sure that the FILE buffers are flushed to the kernel before calling
fsync on the file descriptor.

Thanks to Li Shao Kai for reporting this issue in the Redis mailing
list.
2014-01-22 09:56:35 +01:00
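The ordering described above can be sketched as follows. The helper name is hypothetical and this is not the rio.c code; it only shows the two buffering layers being flushed in order.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Push data through both buffering layers: stdio buffer, then kernel.
 * Returns 0 on success, -1 on error. */
int flush_and_sync(FILE *fp) {
    assert(fp != NULL);
    /* User-space FILE buffer -> kernel. Without this step fsync() may
     * run while part of the data is still sitting in the FILE buffer. */
    if (fflush(fp) == EOF) return -1;
    /* Kernel buffers -> disk. */
    if (fsync(fileno(fp)) == -1) return -1;
    return 0;
}
```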
antirez
d401022d14 Fix typo in aofRewriteBufferAppend() comment. 2014-01-14 15:38:26 +01:00
antirez
4ea91aadfb Set REDIS_AOF_REWRITE_MIN_SIZE to 64mb.
64mb is the default value in redis.conf. For some reason the hard-coded
default was instead 1mb, which is too small.
2014-01-14 11:28:18 +01:00
antirez
71f7eb2590 Redis 2.8.4. 2.8.4 2014-01-13 17:09:58 +01:00
antirez
df00b6ac05 SENTINEL SET: error on bad option name + flush config on error. 2014-01-13 16:39:56 +01:00
antirez
b8db8a0c48 SENTINEL SET implemented.
The new command allows changing master-specific configurations
at runtime. All the settable parameters can be retrieved via the
SENTINEL MASTER command, so there is no equivalent "GET" command.
2014-01-13 16:39:51 +01:00
antirez
8ed0b49525 Sentinel: fix wrong arity error message. 2014-01-13 16:39:47 +01:00
antirez
560e548dc6 Sentinel: SENTINEL REMOVE command added.
The command totally removes a monitored master.
2014-01-13 16:39:44 +01:00
antirez
0ca750d94a Sentinel: releaseSentinelRedisInstance() top comment fixed.
The claim about unlinking the instance from the connected hash tables
was the opposite of the reality. Also the current actual behavior is
safer in most cases, so it is better to manually unlink when needed.
2014-01-13 16:39:40 +01:00
antirez
1f9580c125 Sentinel: flush config on disk when new master is added. 2014-01-13 16:39:36 +01:00
antirez
bb207007ca anetResolveIP() prototype added to anet.h. 2014-01-13 16:39:31 +01:00
antirez
b7dc3204a6 Sentinel: SENTINEL MONITOR command implemented.
It allows adding new masters to monitor at runtime.
2014-01-13 16:39:27 +01:00
antirez
2e0ba7bb5b anetResolveIP() added to anet.c.
The new function is used when we want to normalize an IP address without
performing a DNS lookup if the string to resolve is not a valid IP.

This is useful every time only IPs are valid inputs or when we want to
skip DNS resolution that is slow during runtime operations if we are
required to block.
2014-01-13 16:39:24 +01:00
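A sketch of how an IP-only resolution can be obtained with the standard resolver, assuming getaddrinfo(3) with the AI_NUMERICHOST flag. The function name is hypothetical and this is not the body of anetResolveIP().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/* Normalize a literal IPv4 or IPv6 address into buf. Returns 0 on success
 * and -1 if host is not a literal IP. AI_NUMERICHOST tells the resolver
 * to never perform a DNS lookup, so the call cannot block on the network. */
int resolve_ip_only(const char *host, char *buf, size_t buflen) {
    struct addrinfo hints, *info;
    const void *addr;
    const char *res;

    assert(host != NULL && buf != NULL);
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;
    if (getaddrinfo(host, NULL, &hints, &info) != 0) return -1;

    if (info->ai_family == AF_INET)
        addr = &((struct sockaddr_in *)info->ai_addr)->sin_addr;
    else
        addr = &((struct sockaddr_in6 *)info->ai_addr)->sin6_addr;
    res = inet_ntop(info->ai_family, addr, buf, (socklen_t)buflen);
    freeaddrinfo(info);
    return res ? 0 : -1;
}
```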
antirez
52cf0975c7 Sentinel: added SENTINEL MASTER <name> command.
With SENTINEL MASTERS it was already possible to list all the configured
masters, but not a specific one.
2014-01-13 16:39:18 +01:00
antirez
6bcc370c9d Add all the configurable fields to addReplySentinelRedisInstance().
Note: the auth password with the master is voluntarily not exposed.
2014-01-13 16:39:15 +01:00
antirez
ea2bffa030 Trim comment to 80 cols in SentinelCommand(). 2014-01-13 16:39:11 +01:00
antirez
bca65866ff Test: regression for issue #1483. 2014-01-09 11:19:11 +01:00
antirez
e4c51759f5 Fix RESTORE ttl handling in 32 bit archs.
long was used instead of long long in order to handle a 64 bit
resolution millisecond timestamp.

This fixes issue #1483.
2014-01-09 11:12:53 +01:00
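To illustrate why the type matters: where `long` is 32 bits wide it tops out at 2147483647, which read as milliseconds is about 24.8 days, while a millisecond Unix timestamp is far larger. The helpers below are hypothetical and only demonstrate the range problem, not the RESTORE code.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a millisecond value fits a 32 bit signed integer, which is
 * what long is on many 32 bit architectures. */
int fits_in_32bit_long(int64_t value_ms) {
    return value_ms >= INT32_MIN && value_ms <= INT32_MAX;
}

/* Compute an absolute expire time using 64 bit arithmetic throughout. */
int64_t absolute_expire_ms(int64_t now_ms, int64_t ttl_ms) {
    assert(ttl_ms >= 0);
    return now_ms + ttl_ms;
}
```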
antirez
51f71e2d8e Fix keyspace events flags-to-string conversion.
Fixes issue #1491 on Github.
2014-01-08 17:18:00 +01:00
antirez
05d219a6fa Test: stress events flags to/from string conversion. 2014-01-08 17:16:09 +01:00
antirez
2a1a31ca9d Don't send REPLCONF ACK to old masters.
Masters not understanding REPLCONF ACK will reply with errors to our
requests causing a number of possible issues.

This commit detects a global replication offset set to -1 at the end of
the replication, and marks the client representing the master with the
REDIS_PRE_PSYNC flag.

Note that this flag was called REDIS_PRE_PSYNC_SLAVE but now it is just
REDIS_PRE_PSYNC as it is used for both slaves and masters starting with
this commit.

This commit fixes issue #1488.
2014-01-08 14:27:49 +01:00
antirez
418d3d358a Clarify a comment in slaveTryPartialResynchronization(). 2014-01-08 14:11:02 +01:00
antirez
0a1a236e3e Log disconnection with slave only when ip:port is available. 2013-12-25 18:41:10 +01:00
antirez
27d06111db anetPeerToString / SockName: port can be NULL on errors too. 2013-12-25 18:39:49 +01:00
antirez
5b7c16137d anetTcpGenericConnect() bug introduced in 9d19977 fixed.
During a refactoring I misspelled _port for port.
This is one of the reasons I never used _varname myself.
2013-12-25 18:38:33 +01:00
antirez
d07d4a876c Remove useless goto from anetTcpGenericConnect(). 2013-12-25 18:24:04 +01:00
antirez
9d1997706c anetTcpGenericConnect() code improved + 1 bug fix.
Now the socket is closed if anetNonBlock() fails, and in general the
code structure makes it harder to introduce this kind of bug in the
future.

Reference: pull request #1059.
2013-12-25 18:16:46 +01:00
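The leak fixed above follows a common pattern: a descriptor is created, a later setup step fails, and the error path returns without closing it. A minimal sketch with hypothetical function names, not the anetTcpGenericConnect() code:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Put an open descriptor in non-blocking mode. Returns 0 or -1. */
static int set_nonblock(int fd) {
    int flags;
    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL);
    if (flags == -1) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Create a non-blocking TCP socket. Every failure after socket() closes
 * the descriptor before returning, so no error path leaks it. */
int open_nonblocking_socket(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1) return -1;
    if (set_nonblock(fd) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```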
antirez
4ad219adc8 Fix CONFIG REWRITE handling of unknown options.
There were two problems with the implementation.

1) "save" was not correctly processed when no save point was configured,
   as reported in issue #1416.
2) The way the code checked if an option existed in the "processed"
   dictionary was wrong, as we add the element as a key associated
   with a NULL value, so dictFetchValue() can't be used to check for
   existence, but dictFind() must be used, that returns NULL only if the
   entry does not exist at all.
2013-12-23 12:50:52 +01:00
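The second problem can be reproduced with any map that stores NULL values. The toy table below uses hypothetical names and is not the Redis dict API; it only mirrors the difference between a lookup that returns the value and one that returns the entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy key/value table, for illustration only. */
typedef struct { const char *key; void *val; } entry;

/* Returns the entry for key, or NULL only when the key is absent. */
const entry *find_entry(const entry *tab, size_t n, const char *key) {
    size_t i;
    assert(tab != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (strcmp(tab[i].key, key) == 0) return &tab[i];
    return NULL;
}

/* Returns the stored value, or NULL when the key is absent. A key stored
 * with a NULL value is therefore indistinguishable from a missing key. */
void *fetch_value(const entry *tab, size_t n, const char *key) {
    const entry *e = find_entry(tab, n, key);
    return e ? e->val : NULL;
}
```

An existence check has to go through the entry lookup: a "processed" option stored with a NULL value looks missing to `fetch_value()` but is found by `find_entry()`.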
antirez
c6db326d1d Configuring port to 0 disables IP socket as specified.
This was no longer the case with 2.8 because of a bug introduced with
the IPv6 support. Now it is fixed.

This fixes issues #1287 and #1477.
2013-12-23 11:34:15 +01:00