2987 Commits

Author SHA1 Message Date
antirez
3119f4f694 Cluster: better timeout and retry time for failover.
When node-timeout is too small, in the order of a few milliseconds,
there is no way the voting process can terminate during that time, so we
set a lower limit for the failover timeout of two seconds.

The retry time is set to two times the failover timeout time, so it is
at least 4 seconds.
2014-03-11 11:10:09 +01:00
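The clamping described in the commit above can be sketched as follows. This is a minimal Python model, not the Redis C code: the function name and the `multiplier` parameter are hypothetical, and deriving the base timeout as a multiple of node-timeout is an assumption. Only the two-second floor and the doubling for the retry time come from the commit message.

```python
# Lower limit for the failover timeout, from the commit message (two seconds).
MIN_FAILOVER_TIMEOUT_MS = 2000


def failover_timings(node_timeout_ms, multiplier=2):
    """Return (failover_timeout_ms, retry_time_ms).

    `multiplier` is an assumption made for illustration; the commit message
    only states the floor and the "retry = 2 x timeout" relation.
    """
    timeout = max(node_timeout_ms * multiplier, MIN_FAILOVER_TIMEOUT_MS)
    retry = timeout * 2  # therefore never less than 4000 ms
    return timeout, retry


print(failover_timings(5))      # tiny node-timeout: the floor applies
print(failover_timings(15000))  # large node-timeout: the floor is irrelevant
```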
Matt Stancliff
14f77b343a Fix key extraction for z{union,inter}store
The previous implementation wasn't taking into account
the storage key in position 1 being a requirement (it
was only counting the source keys in positions 3 to N).

Fixes antirez/redis#1581
2014-03-11 11:10:09 +01:00
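As a rough illustration of the fix above, the sketch below returns the argument positions that hold keys for a command shaped like `ZUNIONSTORE dest numkeys key [key ...]`. It is a Python model with a hypothetical function name, not the actual Redis key-extraction function; the handling of malformed input (returning an empty list) is an assumption.

```python
def zstore_key_positions(argv):
    """Return the argv indexes that are keys: destination first, then sources."""
    if len(argv) < 4:
        return []
    try:
        numkeys = int(argv[2])
    except ValueError:
        return []
    # numkeys must be positive and must not run past the end of the arguments.
    if numkeys < 1 or numkeys > len(argv) - 3:
        return []
    # Position 1 is the storage key; positions 3..3+numkeys-1 are source keys.
    return [1] + list(range(3, 3 + numkeys))


argv = ["ZUNIONSTORE", "out", "2", "a", "b", "WEIGHTS", "1", "2"]
print(zstore_key_positions(argv))
```

Counting only the source keys would miss position 1, which matters for anything that routes a command by the keys it touches.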
antirez
afe28cfd75 Cluster: fix conditional generating TRYAGAIN error. 2014-03-11 11:10:09 +01:00
antirez
aa5898f53e Redis Cluster: support for multi-key operations. 2014-03-11 11:10:09 +01:00
Matt Stancliff
c0915ad1a0 Reset op_sec_last_sample_ops when reset requested
This value needs to be set to zero (in addition to
stat_numcommands) or else people may see
a negative operations per second count after they
run CONFIG RESETSTAT.

Fixes antirez/redis#1577
2014-03-11 11:10:09 +01:00
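Why the count can go negative is easier to see with numbers. The sketch below assumes the rate is computed as the difference between the current command counter and the counter value at the last sample, divided by the elapsed time; the function and parameter names are hypothetical and this is not the Redis sampling code.

```python
def ops_per_sec(total_commands, last_sample_ops, elapsed_ms):
    """Commands per second since the last sample (integer arithmetic)."""
    return (total_commands - last_sample_ops) * 1000 // elapsed_ms


# Normal operation: 100 commands in 100 ms.
print(ops_per_sec(5000, 4900, 100))

# After a stats reset that zeroes only the command counter, the stale sample
# (4900) is larger than the new counter (10), so the rate is negative.
print(ops_per_sec(10, 4900, 100))

# Zeroing the last sample as well gives a sane value again.
print(ops_per_sec(10, 0, 100))
```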
Matt Stancliff
4b3c87a027 Remove redundant IP length definition
REDIS_CLUSTER_IPLEN had the same value as
REDIS_IP_STR_LEN.  They were both #define'd
to the same INET6_ADDRSTRLEN.
2014-03-11 11:10:09 +01:00
Matt Stancliff
7c8964a8cf Remove some redundant code
Function nodeIp2String in cluster.c is exactly
anetPeerToString with a pre-extracted fd.
2014-03-11 11:09:37 +01:00
Matt Stancliff
7c359449d5 Fix return value check for anetTcpAccept
anetTcpAccept returns ANET_ERR, not AE_ERR.

This isn't a physical error since both ANET_ERR
and AE_ERR are -1, but better to be consistent.
2014-03-11 11:09:37 +01:00
Jan-Erik Rediger
6766fc561e Small typo fixed 2014-03-11 11:09:37 +01:00
Matt Stancliff
9a7cf31960 Bind source address for cluster communication
The first address specified as a bind parameter
(server.bindaddr[0]) gets used as the source IP
for cluster communication.

If no bind address is specified by the user, the
behavior is unchanged.

This patch allows multiple Redis Cluster instances
to communicate when running on the same interface
of the same host.
2014-03-11 11:09:37 +01:00
zhanghailei
503938022f According to updateLRUClock's comment, REDIS_LRU_CLOCK_MAX is 22 bits, but #define REDIS_LRU_CLOCK_MAX ((1<<21)-1) is only 21 bits 2014-03-11 11:09:37 +01:00
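The bit-width mismatch named in this commit title can be checked with plain arithmetic. The snippet below is only a check of the two expressions, written in Python; the variable names are hypothetical.

```python
# The expression quoted in the commit title.
max_with_shift_21 = (1 << 21) - 1
# What a 22-bit maximum would look like.
max_with_shift_22 = (1 << 22) - 1

# int.bit_length() gives the number of bits needed to represent the value.
print(max_with_shift_21, max_with_shift_21.bit_length())
print(max_with_shift_22, max_with_shift_22.bit_length())
```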
zhanghailei
7eec424953 Fixed a typo: "more thank" should be "more than" 2014-03-11 11:09:37 +01:00
zhanghailei
0abe98cb4d According to context, the size should be 16 rather than 64 2014-03-11 11:09:37 +01:00
Matt Stancliff
a0ea8f235e Cluster: error out quicker if port is unusable
The default cluster control port is 10,000 ports higher than
the base Redis port.  If Redis is started on a too-high port,
Cluster can't start and everything will exit later anyway.
2014-03-11 11:09:37 +01:00
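The early check described above amounts to a bounds test on the derived port. The following Python sketch uses hypothetical names and raises an exception where a server would log an error and exit; the 10,000 offset and the 65535 limit are taken from the surrounding commit messages.

```python
CLUSTER_PORT_OFFSET = 10000  # cluster control port = base port + 10000
MAX_TCP_PORT = 65535


def cluster_bus_port(base_port):
    """Return the cluster control port, failing early if it cannot exist."""
    bus_port = base_port + CLUSTER_PORT_OFFSET
    if bus_port > MAX_TCP_PORT:
        raise ValueError(
            f"base port {base_port} is too high: cluster port {bus_port} "
            f"exceeds {MAX_TCP_PORT}"
        )
    return bus_port


print(cluster_bus_port(6379))
```

Naming the port that actually failed in the message is the same idea as the error-reporting fix in the next commit.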
Matt Stancliff
6f4b5ef6d5 Fix "can't bind to address" error reporting.
Report the actual port used for the listening attempt instead of
server.port.

Originally, Redis would just listen on server.port.
But, with clustering, Redis uses a Cluster Port too,
so we can't say server.port is always where we are listening.

If you tried to launch Redis with a too-high port number (any
port where Port+10000 > 65535), Redis would refuse to start, but
only print an error saying it can't connect to the Redis port.

This patch removes much of that confusion.
2014-03-11 11:09:37 +01:00
antirez
4d5ba5962c Cast saveparams[].seconds to long for %ld format specifier. 2014-03-05 11:26:46 +01:00
antirez
313f8831ed Sentinel: more aggressive failover start desynchronization.
Sentinel needs to avoid split brain conditions due to multiple sentinels
trying to get voted at the exact same time.

So far some desynchronization was provided by fluctuating server.hz,
that is the frequency of the timer function call. However the
desynchronization provided in this way was not enough when using many
Sentinel instances, especially when a large quorum value is used in
order to force a greater degree of agreement (more than N/2+1).

It was verified that it was likely to trigger a split brain
condition, forcing the system to try again after a timeout.
Usually the system will succeed after a few retries, but this is not
optimal.

This commit desynchronizes instances in a more effective way to make it
likely that the first attempt will be successful.
2014-03-05 10:22:07 +01:00
antirez
5ee2394474 CONFIG REWRITE should be logged at WARNING level. 2014-03-05 10:22:07 +01:00
antirez
d2e16801f0 Cluster: invalidate current transaction on redirections. 2014-03-05 10:22:07 +01:00
antirez
1d9eb47f9d Document why we update peak memory in INFO. 2014-03-05 10:22:07 +01:00
antirez
e4833ed8bf Fix configEpoch assignment when a cluster slot gets "closed".
This is still code to rework in order to use agreement to obtain a new
configEpoch when a slot is migrated, however this commit handles the
special case that happens when the nodes are just started and everybody
has a configEpoch of 0. In this special condition having the maximum
configEpoch is not enough as the special epoch 0 is not unique (all the
others are).

This does not fix the intrinsic race condition of a failover happening
while we are resharding, that will be addressed later.
2014-03-05 10:22:07 +01:00
Matt Stancliff
7c092b679f Force INFO used_memory_peak to match peak memory
used_memory_peak only updates in serverCron every server.hz,
but Redis can use more memory and a user can request memory
INFO before used_memory_peak gets updated in the next
cron run.

This patch updates used_memory_peak to the current
memory usage if the current memory usage is higher
than the recorded used_memory_peak value.

(And it only calls zmalloc_used_memory() once instead of
twice as it was doing before.)
2014-03-05 10:22:07 +01:00
michael-grunder
23addbb5a3 Improved bigkeys with progress, pipelining and summary
This commit reworks the redis-cli --bigkeys command to provide more
information about our progress as well as output summary information
when we're done.

 - We now show an approximate percentage completion as we go
 - Hiredis pipelining is used for TYPE and SIZE retrieval
 - A summary of keyspace distribution and overall breakout at the end
2014-03-05 10:22:07 +01:00
antirez
a46811693d Sentinel test: Makefile target added. 2014-02-28 16:00:14 +01:00
antirez
9104f1e672 warnigns -> warnings in redisBitpos(). 2014-02-27 15:52:43 +01:00
antirez
eacc0951a2 More consistent BITPOS behavior with bit=0 and ranges.
With the new behavior it is possible to specify just the start in the
range (the end will be assumed to be the last byte), or it is possible
to specify both start and end.

This is useful to change the behavior of the command when looking for
zeros inside a string.

1) If the user specifies both start and end, and no 0 is found inside
   the range, the command returns -1.

2) If instead no range is specified, or just the start is given, even
   if in the actual string no 0 bit is found, the command returns the
   first bit on the right after the end of the string.

So for example if the string stored at key foo is "\xff\xff":

    BITPOS foo 0 (returns 16)
    BITPOS foo 0 0 -1 (returns -1)
    BITPOS foo 0 0 (returns 16)

The idea is that when no end is given the user is just looking for the
first bit that is zero and can be set to 1 with SETBIT, as it is
"available". Instead when a specific range is given, we just look for a
zero within the boundaries of the range.
2014-02-27 15:52:43 +01:00
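The two rules for bit=0 can be modelled in a few lines. This is a Python sketch of the behavior described in the commit message, not the C implementation: the function name is hypothetical, ranges are byte offsets with negative values counting from the end, and it assumes a non-empty string.

```python
def bitpos_zero(value, start=None, end=None):
    """Position of the first 0 bit in `value` (bytes), MSB-first per byte.

    Assumes a non-empty value. If `end` is given and no 0 bit is inside the
    range, return -1. Otherwise return the first bit after the searched range.
    """
    n = len(value)
    end_given = end is not None
    s = 0 if start is None else start
    e = -1 if end is None else end
    # Negative offsets count from the end of the string.
    if s < 0:
        s = max(n + s, 0)
    if e < 0:
        e = max(n + e, 0)
    if e >= n:
        e = n - 1
    if s > e:
        return -1
    for byte_index in range(s, e + 1):
        for bit in range(8):
            if not (value[byte_index] >> (7 - bit)) & 1:
                return byte_index * 8 + bit
    # No 0 bit found: the answer depends on whether the user bounded the range.
    return -1 if end_given else (e + 1) * 8


foo = b"\xff\xff"
print(bitpos_zero(foo))         # no range
print(bitpos_zero(foo, 0, -1))  # start and end
print(bitpos_zero(foo, 0))      # start only
```

With the "\xff\xff" value from the commit message, the three calls give 16, -1 and 16, matching the example.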
antirez
1f8005ca09 Initial implementation of BITPOS.
It appears to work but more stress testing, and both unit tests and
fuzz testing, is needed in order to ensure the implementation is sane.
2014-02-27 15:52:43 +01:00
antirez
24265edb6c Fix misaligned word access in redisPopcount(). 2014-02-27 15:52:43 +01:00
Matt Stancliff
7e274194bf Fix IP representation in clusterMsgDataGossip 2014-02-27 15:52:43 +01:00
michael-grunder
a2b3f2eae2 Update --bigkeys to use SCAN
This commit changes the findBigKeys() function in redis-cli.c to use the new
SCAN command for iterating the keyspace, rather than RANDOMKEY.  Because we
can know when we're done using SCAN, it will exit after exhausting the keyspace.
2014-02-25 15:09:46 +01:00
antirez
48fa34bf9e redis-cli: also remove useless uint8_t. 2014-02-25 15:09:46 +01:00
antirez
ff20d05650 redis-cli: don't use uint64_t where actually not needed.
The computation is just something to keep the CPU busy, no need to use a
specific type. Since stdint.h was not included this prevented
compilation on certain systems.
2014-02-25 15:09:46 +01:00
antirez
6a95ddb248 redis-cli: check argument existence for --pattern. 2014-02-25 15:09:46 +01:00
antirez
2431b63ff4 redis-cli: --intrinsic-latency run mode added. 2014-02-25 15:09:46 +01:00
antirez
68e9597e8a redis-cli: added comments to split program in parts. 2014-02-25 15:09:46 +01:00
antirez
4f6ed3412e Sentinel: log quorum with +monitor event. 2014-02-25 10:24:16 +01:00
antirez
56e18ba4f6 Sentinel: generate +monitor events at startup. 2014-02-25 10:24:16 +01:00
antirez
cd68a1d45a Sentinel: log +monitor and +set events.
Now that we have a runtime configuration system, it is very important to
be able to log how the Sentinel configuration changes over time because
of API calls.
2014-02-25 10:24:16 +01:00
antirez
4af2acf2b0 Sentinel: added missing exit(1) after checking for config file. 2014-02-25 10:24:16 +01:00
antirez
6e2e6d5b8c Sentinel: IDONTKNOW error removed.
This error was conceived for the older version of Sentinel that worked
via master redirection and that was not able to get configuration
updates from other Sentinels via the Pub/Sub channel of masters or
slaves.

This reply does not make sense today: every Sentinel should reply with
the best information it currently has. The error will make even less
sense in the future since the plan is to allow Sentinels to update the
configuration of other Sentinels via gossip with a direct chat without
the prerequisite that they have at least a monitored instance in common.
2014-02-25 10:24:16 +01:00
Matt Stancliff
5a8c9f94a6 Add cluster or sentinel to proc title
If you launch redis with `redis-server --sentinel` then
in a ps, your output only says "redis-server IP:Port" — this
patch changes the proc title to include [sentinel] or
[cluster] depending on the current server mode:
e.g.  "redis-server IP:Port [sentinel]"
      "redis-server IP:Port [cluster]"
2014-02-25 10:24:16 +01:00
Matt Stancliff
21a7f9e7ef Auto-enter slaveMode when SYNC from redis-cli
If someone asks for SYNC or PSYNC from redis-cli,
automatically enter slaveMode (as if they ran
redis-cli --slave) and continue printing the replication
stream until either they Ctrl-C or the master gets disconnected.
2014-02-25 10:24:16 +01:00
antirez
48d74f2039 Sentinel: report instances role switch events.
This is useful mostly for debugging of issues.
2014-02-20 12:28:18 +01:00
antirez
6e4662e479 Sentinel: SENTINEL_SLAVE_RECONF_RETRY_PERIOD -> RECONF_TIMEOUT
Rename define to match the new meaning.
2014-02-18 10:30:47 +01:00
antirez
bd31fcf16e Sentinel: fix slave promotion timeout.
If we can't reconfigure a slave in time during failover, go forward
anyway, as the slave will be fixed by Sentinels in the future, once they
detect it is misconfigured.

Otherwise a failover in progress may never terminate if for some reason
the slave is unable to sync with the master while at the same time
it is not disconnected.
2014-02-18 10:30:47 +01:00
antirez
58b6dd9beb Get absolute config file path before processing 'dir'.
The code tried to obtain the configuration file absolute path after
processing the configuration file. However if config file was a relative
path and a "dir" statement was processed reading the config, the absolute
path obtained was wrong.

With this fix the absolute path is obtained before processing the
configuration while the server is still in the original directory where
it was executed.
2014-02-17 17:39:12 +01:00
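The ordering problem is easy to reproduce outside Redis. The Python sketch below uses hypothetical helper names and temporary directories standing in for the start directory and the "dir" target; it only shows that resolving a relative path after a directory change yields a different absolute path.

```python
import os
import tempfile


def resolve_then_chdir(config_path, new_dir):
    """Fixed order: resolve against the original directory, then change dir."""
    absolute = os.path.abspath(config_path)
    os.chdir(new_dir)
    return absolute


def chdir_then_resolve(config_path, new_dir):
    """Old order: the relative path is resolved against the new directory."""
    os.chdir(new_dir)
    return os.path.abspath(config_path)


def demo():
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as start, tempfile.TemporaryDirectory() as data:
        # realpath avoids symlinked temp locations confusing the comparison.
        start, data = os.path.realpath(start), os.path.realpath(data)
        try:
            os.chdir(start)
            good = resolve_then_chdir("redis.conf", data)
            os.chdir(start)
            bad = chdir_then_resolve("redis.conf", data)
        finally:
            os.chdir(original_cwd)
        return (good == os.path.join(start, "redis.conf"),
                bad == os.path.join(data, "redis.conf"))


print(demo())
```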
antirez
c36a5dce54 Sentinel: better specify startup errors due to config file.
Now it logs the file name if it is not accessible. Also there is a
different error for the missing config file case, and for the non
writable file case.
2014-02-17 17:39:09 +01:00
antirez
3c1672da7d Update cached time in rdbLoad() callback.
server.unixtime and server.mstime are cached less precise timestamps
that we use every time we don't need an accurate time representation and
a syscall would be too slow for the number of calls we require.

Such an example is the initialization and update process of the last
interaction time with the client, that is used for timeouts.

However rdbLoad() can take some time to load the DB, but at the same
time it did not update the time during DB loading. This resulted in the
bug described in issue #1535, where in the replication process the slave
loads the DB, creates the redisClient representation of its master, but
the timestamp is so old that the master, under certain conditions, is
sensed as already "timed out".

Thanks to @yoav-steinberg and Redis Labs Inc for the bug report and
analysis.
2014-02-13 15:13:38 +01:00
antirez
116617c5e7 Log when CONFIG REWRITE goes bad. 2014-02-13 14:33:53 +01:00
antirez
14143fbede Fix script cache bug in the scripting engine.
This commit fixes a serious Lua scripting replication issue, described
by Github issue #1549. The root cause of the problem is that scripts
were put inside the script cache, assuming that slaves and AOF already
contained it, even if the scripts sometimes produced no changes in the
data set, and were not actually propagated to AOF/slaves.

Example:

    eval "if tonumber(KEYS[1]) > 0 then redis.call('incr', 'x') end" 1 0

Then:

    evalsha <sha1 step 1 script> 1 0

At this step sha1 of the script is added to the replication script cache
(the script is marked as known to the slaves) and EVALSHA command is
transformed to EVAL. However it is not dirty (there are no changes to the db),
so it is not propagated to the slaves. Then the script is called again:

    evalsha <sha1 step 1 script> 1 1

At this step master checks that the script already exists in the
replication script cache and doesn't transform it to EVAL command. It is
dirty and propagated to the slaves, but they fail to evaluate the script
as they don't have it in the script cache.

The fix is trivial and just uses the new API to force the propagation of
the executed command regardless of the dirty state of the data set.

Thank you to @minus-infinity on Github for finding the issue,
understanding the root cause, and fixing it.
2014-02-13 12:16:31 +01:00
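The sequence of steps in the commit above can be replayed with a toy model. Everything here is a simplified Python sketch with hypothetical class and method names: the replica only tracks which scripts it knows, and forcing propagation whenever EVALSHA is rewritten to EVAL is an assumption about how the fix is applied.

```python
import hashlib


def sha1(script):
    return hashlib.sha1(script.encode()).hexdigest()


class Replica:
    def __init__(self):
        self.scripts = set()

    def receive(self, command, payload):
        """Return True if the replica can execute the propagated command."""
        if command == "EVAL":
            self.scripts.add(sha1(payload))
            return True
        return payload in self.scripts  # EVALSHA needs a known digest


class Master:
    def __init__(self, replica, force_propagation):
        self.replica = replica
        self.force_propagation = force_propagation
        self.repl_script_cache = set()

    def evalsha(self, script, dirty):
        """`dirty` says whether this run changed the data set."""
        digest = sha1(script)
        if digest in self.repl_script_cache:
            command, payload = "EVALSHA", digest
        else:
            # Script is marked as known to replicas and rewritten to EVAL.
            self.repl_script_cache.add(digest)
            command, payload = "EVAL", script
            if self.force_propagation:
                dirty = True  # the fix: propagate even without changes
        if dirty:
            return self.replica.receive(command, payload)
        return True  # nothing propagated, nothing can fail yet


def run_scenario(force_propagation):
    master = Master(Replica(), force_propagation)
    script = "if tonumber(KEYS[1]) > 0 then redis.call('incr', 'x') end"
    master.evalsha(script, dirty=False)        # key "0": no changes
    return master.evalsha(script, dirty=True)  # key "1": changes the data set


print(run_scenario(False), run_scenario(True))
```

Without forced propagation the second call sends EVALSHA for a script the replica never received; with it, the first call ships the full script and the second succeeds.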