458 Commits

Author SHA1 Message Date
Matt Stancliff
6826af1b50 Replace magic 32 with REDIS_EVENTLOOP_FDSET_INCR
32 was the additional number of file descriptors Redis
would reserve when managing a too-low ulimit.  The
number 32 was in too many places statically, so now
we use a macro instead that looks more appropriate.

When Redis sets up the server event loop, it uses:
    server.maxclients+REDIS_EVENTLOOP_FDSET_INCR

So, when reserving file descriptors, it makes sense to
reserve at least REDIS_EVENTLOOP_FDSET_INCR FDs instead
of only 32.  Currently, REDIS_EVENTLOOP_FDSET_INCR is
set to 128 in redis.h.

Also, I replaced the static 128 in the while f < old loop
with REDIS_EVENTLOOP_FDSET_INCR as well, which results
in no change since it was already 128.

Impact: Users now need at least maxclients+128 as
their open file limit instead of maxclients+32 to obtain
actual "maxclients" number of clients.  Redis will carve
the extra REDIS_EVENTLOOP_FDSET_INCR file descriptors it
needs out of the "maxclients" range instead of failing
to start (unless the local ulimit -n is too low to accomidate
the request).
2014-03-25 09:07:21 +01:00
antirez
4ebc7e3738 Sample and cache RSS in serverCron().
Obtaining the RSS (Resident Set Size) info is slow in Linux and OSX.
This slowed down the generation of the INFO 'memory' section.

Since the RSS does not require to be a real-time measurement, we
now sample it with server.hz frequency (10 times per second by default)
and use this value both to show the INFO rss field and to compute the
fragmentation ratio.

Practically this does not make any difference for memory profiling of
Redis but speeds up the INFO call significantly.
2014-03-24 12:03:44 +01:00
antirez
7054f2a029 Cache uname() output across INFO calls.
Uname was profiled to be a slow syscall. It produces always the same
output in the context of a single execution of Redis, so calling it at
every INFO output generation does not make too much sense.

The uname utsname structure was modified as a static variable. At the
same time a static integer was added to check if we need to call uname
the first time.
2014-03-24 10:02:58 +01:00
antirez
e3b71a1c02 Unify stats reset for CONFIG RESETSTAT / initServer().
Now CONFIG RESETSTAT makes sure to reset all the fields, and in the
future it will be simpler to avoid missing new fields.
2014-03-21 11:16:12 +01:00
antirez
55b8f6ec1c Document why we update peak memory in INFO. 2014-03-05 10:16:20 +01:00
Matt Stancliff
647a261465 Force INFO used_memory_peak to match peak memory
used_memory_peak only updates in serverCron every server.hz,
but Redis can use more memory and a user can request memory
INFO before used_memory_peak gets updated in the next
cron run.

This patch updates used_memory_peak to the current
memory usage if the current memory usage is higher
than the recorded used_memory_peak value.

(And it only calls zmalloc_used_memory() once instead of
twice as it was doing before.)
2014-03-05 10:16:16 +01:00
antirez
25e2791ec3 Initial implementation of BITPOS.
It appears to work but more stress testing, and both unit tests and
fuzzy testing, is needed in order to ensure the implementation is sane.
2014-02-27 15:54:31 +01:00
antirez
4237f14a8a Get absoulte config file path before processig 'dir'.
The code tried to obtain the configuration file absolute path after
processing the configuration file. However if config file was a relative
path and a "dir" statement was processed reading the config, the absolute
path obtained was wrong.

With this fix the absolute path is obtained before processing the
configuration while the server is still in the original directory where
it was executed.
2014-02-17 12:14:19 +01:00
antirez
85492dcfef Update cached time in rdbLoad() callback.
server.unixtime and server.mstime are cached less precise timestamps
that we use every time we don't need an accurate time representation and
a syscall would be too slow for the number of calls we require.

Such an example is the initialization and update process of the last
interaction time with the client, that is used for timeouts.

However rdbLoad() can take some time to load the DB, but at the same
time it did not updated the time during DB loading. This resulted in the
bug described in issue #1535, where in the replication process the slave
loads the DB, creates the redisClient representation of its master, but
the timestamp is so old that the master, under certain conditions, is
sensed as already "timed out".

Thanks to @yoav-steinberg and Redis Labs Inc for the bug report and
analysis.
2014-02-13 15:13:35 +01:00
antirez
96973a7c33 AOF write error: retry with a frequency of 1 hz. 2014-02-12 16:57:17 +01:00
antirez
fadbbdd3f4 AOF: don't abort on write errors unless fsync is 'always'.
A system similar to the RDB write error handling is used, in which when
we can't write to the AOF file, writes are no longer accepted until we
are able to write again.

For fsync == always we still abort on errors since there is currently no
easy way to avoid replying with success to the user otherwise, and this
would violate the contract with the user of only acknowledging data
already secured on disk.
2014-02-12 16:57:13 +01:00
antirez
688d32e16b Don't count time to feed MONITORs in SLOWLOG. 2014-02-07 18:29:26 +01:00
antirez
3e4968339b Sentinel: allow SHUTDOWN command in Sentinel mode. 2014-02-07 11:22:30 +01:00
antirez
5201ca0ca1 Allow CONFIG and SHUTDOWN while in stale-slave state. 2014-02-03 15:51:07 +01:00
antirez
917b851491 Option "backlog" renamed "tcp-backlog".
This is especially important since we already have a concept of backlog
(the replication backlog).
2014-01-31 15:03:22 +01:00
Nenad Merdanovic
8dda9dbef0 Add support for listen(2) backlog definition
In high RPS environments, the default listen backlog is not sufficient, so
giving users the power to configure it is the right approach, especially
since it requires only minor modifications to the code.
2014-01-31 15:03:19 +01:00
antirez
89a731b3aa Fixed inverted if condition in MISCONF error code path. 2014-01-28 10:10:56 +01:00
antirez
c6db326d1d Configuring port to 0 disables IP socket as specified.
This was no longer the case with 2.8 becuase of a bug introduced with
the IPv6 support. Now it is fixed.

This fixes issue #1287 and #1477.
2013-12-23 11:34:15 +01:00
Yubao Liu
d62bbfda16 CONFIG REWRITE: don't throw some options on config rewrite
Those options will be thrown without this patch:
  include, rename-command, min-slaves-to-write, min-slaves-max-lag,
appendfilename.
2013-12-19 16:03:37 +01:00
antirez
9a8ae5a553 Replication: publish the slave_repl_offset when disconnected from master.
When a slave was disconnected from its master the replication offset was
reported as -1. Now it is reported as the replication offset of the
previous master, so that failover can be performed using this value in
order to try to select a slave with more processed data from a set of
slaves of the old master.
2013-12-11 15:24:50 +01:00
antirez
7f6743a581 Fixed grammar: before H the article is a, not an. 2013-12-05 16:37:21 +01:00
antirez
83333b08d0 Stop writes on MISCONF only if instance is a master.
From the point of view of the slave not accepting writes from the master
can only create a bigger consistency issue.
2013-11-28 16:25:49 +01:00
antirez
2eb8a46061 Reply to PING with error when there is a MISCONF state. 2013-11-28 16:16:58 +01:00
antirez
4bd1dd1c53 Sentinel: test for writable config file.
This commit introduces a funciton called when Sentinel is ready for
normal operations to avoid putting Sentinel specific stuff in redis.c.
2013-11-21 15:24:22 +01:00
antirez
812f76a850 Sentinel: distinguish between is-master-down-by-addr requests.
Some are just to know if the master is down, and in this case the runid
in the request is set to "*", others are actually in order to seek for a
vote and get elected. In the latter case the runid is set to the runid
of the instance seeking for the vote.
2013-11-21 15:22:55 +01:00
antirez
35816a2e4d ZSCAN implemented. 2013-10-29 16:21:18 +01:00
antirez
1cc7e646af HSCAN implemented. 2013-10-29 16:21:13 +01:00
antirez
24ecabc995 SSCAN implemented. 2013-10-29 16:21:01 +01:00
antirez
7f063b1c8c SCAN is a random command and does not require output sorting.
Sorting the output helps when we want to turn a non-deterministic into a
deterministic command, in that case this is not possible.
2013-10-29 16:20:10 +01:00
Pieter Noordhuis
8c4ff679fe SCAN requires at least 1 argument 2013-10-29 16:19:08 +01:00
Pieter Noordhuis
0dd95e23ab Add SCAN command 2013-10-29 16:18:24 +01:00
antirez
af967ff9cb Allow AUTH / PING when disconnected from slave and serve-stale-data is no. 2013-09-17 09:46:40 +02:00
antirez
9df63fcabf Only run the fast active expire cycle if master & enabled. 2013-08-27 09:31:43 +02:00
antirez
f5903846bc Opening TCP listening ports refactored into a function. 2013-08-22 14:06:54 +02:00
antirez
bd6327855b Print error message when can't bind * on any address. 2013-08-22 13:03:07 +02:00
antirez
2c94d80f58 Fix for issue #1214 simplified. 2013-08-21 11:41:35 +02:00
Allan
778f753deb fixed initServer fail while having no IPv6 nor IPv4 2013-08-21 11:41:30 +02:00
Allan
6c2b34a465 fixed initServer failed if no IPV4 or no IPV6 2013-08-21 11:41:25 +02:00
Allan
0a4656d63f fixed bug issue of #1213 2013-08-21 11:41:21 +02:00
antirez
fa48b1fa32 Add per-db average TTL information in INFO output.
Example:

db0:keys=221913,expires=221913,avg_ttl=655

The algorithm uses a running average with only two samples (current and
previous). Keys found to be expired are considered at TTL zero even if
the actual TTL can be negative.

The TTL is reported in milliseconds.
2013-08-06 15:36:43 +02:00
antirez
31d0f34100 activeExpireCycle(): fix about fast cycle early start.
We don't want to repeat a fast cycle too soon, the previous code was
broken, we need to wait two times the period *since* the start of the
previous cycle in order to avoid there is an even space between cycles:

.-> start                   .-> second start
|                           |
+-------------+-------------+--------------+
| first cycle |    pause    | second cycle |
+-------------+-------------+--------------+

The second and first start must be PERIOD*2 useconds apart hence the *2
in the new code.
2013-08-06 15:36:39 +02:00
antirez
00c8cfef74 Some activeExpireCycle() refactoring. 2013-08-06 15:36:35 +02:00
antirez
8686a70303 Remove dead code and fix comments for new expire code. 2013-08-06 15:36:31 +02:00
antirez
d54e373b84 Darft #2 for key collection algo: more improvements.
This commit makes the fast collection cycle time configurable, at
the same time it does not allow to run a new fast collection cycle
for the same amount of time as the max duration of the fast
collection cycle.
2013-08-06 15:36:27 +02:00
antirez
500155b91b Draft #1 of a new expired keys collection algorithm.
The main idea here is that when we are no longer to expire keys at the
rate the are created, we can't block more in the normal expire cycle as
this would result in too big latency spikes.

For this reason the commit introduces a "fast" expire cycle that does
not run for more than 1 millisecond but is called in the beforeSleep()
hook of the event loop, so much more often, and with a frequency bound
to the frequency of executed commnads.

The fast expire cycle is only called when the standard expiration
algorithm runs out of time, that is, consumed more than
REDIS_EXPIRELOOKUPS_TIME_PERC of CPU in a given cycle without being able
to take the number of already expired keys that are yet not collected
to a number smaller than 25% of the number of keys.

You can test this commit with different loads, but a simple way is to
use the following:

Extreme load with pipelining:

redis-benchmark -r 100000000 -n 100000000  \
        -P 32 set ele:rand:000000000000 foo ex 2

Remove the -P32 in order to avoid the pipelining for a more real-world
load.

In another terminal tab you can monitor the Redis behavior with:

redis-cli -i 0.1 -r -1 info keyspace

and

redis-cli --latency-history

Note: this commit will make Redis printing a lot of debug messages, it
is not a good idea to use it in production.
2013-08-06 15:36:23 +02:00
antirez
3cb3714e99 Redis 2.7.101 (2.8 Release Candidate 1). 2013-07-18 11:26:53 +02:00
yoav
9f6f436a51 Chunked loading of RDB to prevent redis from stalling reading very large keys. 2013-07-16 15:41:59 +02:00
antirez
98757b4a6e Use the environment locale for strcoll() collation. 2013-07-12 13:39:48 +02:00
antirez
a7451c1b6d All IP string repr buffers are now REDIS_IP_STR_LEN bytes. 2013-07-11 17:07:52 +02:00
antirez
2e75d3947c IPv6: bind IPv4 and IPv6 interfaces by default. 2013-07-11 17:07:41 +02:00