1440 Commits

Author SHA1 Message Date
antirez
edb952532d Allow an AOF rewrite buffer > 2GB (Fix for issue #504).
During the AOF rewrite process, the parent process needs to accumulate
the new writes in an in-memory buffer: when the child will terminate the
AOF rewriting process this buffer (that ist the difference between the
dataset when the rewrite was started, and the current dataset) is
flushed to the new AOF file.

We used to implement this buffer using an sds.c string, but sds.c has a
2GB limit. Sometimes the dataset can be big enough, the amount of writes
so high, and the rewrite process slow enough that we overflow the 2GB
limit, causing a crash, documented on github by issue #504.

In order to prevent this from happening, this commit introduces a new
system to accumulate writes, implemented by a linked list of blocks of
10 MB each, so that we also avoid paying the reallocation cost.

Note that theoretically modern operating systems may implement realloc()
simply as a remaping of the old pages, thus with very good performances,
see for instance the mremap() syscall on Linux. However this is not
always true, and jemalloc by default avoids doing this because there are
issues with the current implementation of mremap().

For this reason we are using a linked list of blocks instead of a single
block that gets reallocated again and again.

The changes in this commit lacks testing, that will be performed before
merging into the unstable branch. This fix will not enter 2.4 because it
is too invasive. However 2.4 will log a warning when the AOF rewrite
buffer is near to the 2GB limit.
2012-05-24 15:22:35 +02:00
antirez
8152d0c046 Dead code removed from replication.c.
The user @jokea noticed that the following line of code into
replication.c made little sense:

    addReplySds(slave,sdsempty());

Investigating a bit I found that this was introduced by commit 6208b3a7
three years ago in the early stages of Redis. The code apparently is not
useful at all, so I'm removing it.

This change will not be backported into 2.4 so that in the rare case
this should introduce a bug, we'll have a chance to detect it into the
development branch. However following the code path it seems like the
code is not useful at all, so the risk is truly small.
2012-05-24 15:22:31 +02:00
antirez
27fc5bf582 Use comments to split aof.c into sections.
This makes the code more readable, it is still not the case to split the
file itself into three different files, but the logical separation
improves the readability especially since new commits are going to
introduce an additional section.
2012-05-23 11:52:40 +02:00
jokea
a5a037bf81 Set fd to writable when poll(2) detects POLLERR or POLLHUP event. 2012-05-23 11:33:25 +02:00
antirez
4dada1b5bc Fixed issue #516 (ZINTERSTORE mixing sets and zsets).
Weeks ago trying to fix an harmless GCC warning I introduced a bug in
the ziplist-encoded implementations of sorted sets.

The bug completely broke zuiNext() iterator, that is used in the
ZINTERSTORE and ZUNIONSTORE implementation, so those two commands are no
longer reliable starting from Redis version 2.4.12 and latest 2.6.0-RC
releases.

This commit fixes the problem and adds a regression test.
2012-05-23 11:12:38 +02:00
antirez
4934f93dfb Jemalloc updated to 3.0.0.
Full changelog here:

http://www.canonware.com/cgi-bin/gitweb.cgi?p=jemalloc.git;a=blob_plain;f=ChangeLog;hb=master

Notable improvements from the point of view of Redis:

1) Bugfixing.
2) Support for Valgrind.
3) Support for OSX Lion, FreeBSD.
2012-05-16 11:20:44 +02:00
Pieter Noordhuis
d318803f94 Whitespace 2012-05-14 11:06:34 -07:00
Dave Pacheco
72f30bcd30 use port_getn instead of port_get 2012-05-14 11:01:35 -07:00
Dave Pacheco
2aa1efb8a5 first cut at event port support 2012-05-14 11:01:33 -07:00
antirez
e67d014d9a Added time.h include in redis-cli.
redis-cli.c uses the time() function to seed the PRNG, but time.h was
not included. This was not noticed since sys/time.h is included and was
enough in most systems (but not correct). With Ubuntu 12.04 GCC
generates a warning that made us aware of the issue.
2012-05-14 17:43:31 +02:00
antirez
eee680541b activeExpireCycle(): better precision in max time used.
activeExpireCycle() can consume no more than a few milliseconds per
iteration. This commit improves the precision of the check for the time
elapsed in two ways:

1) We check every 16 iterations instead of the main loop instead of 256.
2) We reset iterations at the start of the function and not every time
   we switch to the next database, so the check is correctly performed
   every 16 iterations.
2012-05-14 17:43:26 +02:00
antirez
a8a981a834 Impovements for: Redis timer, hashes rehashing, keys collection.
A previous commit introduced REDIS_HZ define that changes the frequency
of calls to the serverCron() Redis function. This commit improves
different related things:

1) Software watchdog: now the minimal period can be set according to
REDIS_HZ. The minimal period is two times the timer period, that is:

    (1000/REDIS_HZ)*2 milliseconds

2) The incremental rehashing is now performed in the expires dictionary
as well.

3) The activeExpireCycle() function was improved in different ways:

- Now it checks if it already used too much time using microseconds
  instead of milliseconds for better precision.
- The time limit is now calculated correctly, in the previous version
  the division was performed before of the multiplication resulting in
  a timelimit of 0 if HZ was big enough.
- Databases with less than 1% of buckets fill in the hash table are
  skipped, because getting random keys is too expensive in this
  condition.

4) tryResizeHashTables() is now called at every timer call, we need to
   match the number of calls we do to the expired keys colleciton cycle.

5) REDIS_HZ was raised to 100.
2012-05-14 17:43:23 +02:00
antirez
f7f2b2610e Redis timer interrupt frequency configurable as REDIS_HZ.
Redis uses a function called serverCron() that is very similar to the
timer interrupt of an operating system. This function is used to handle
a number of asynchronous things, like active expired keys collection,
clients timeouts, update of statistics, things related to the cluster
and replication, triggering of BGSAVE and AOF rewrite process, and so
forth.

In the past the timer was called 1 time per second. At some point it was
raised to 10 times per second, but it still was fixed and could not be
changed even at compile time, because different functions called from
serverCron() assumed a given fixed frequency.

This commmit makes the frequency configurable, so that it is simpler to
pick a good tradeoff between overhead of this function (that is usually
very small) and the responsiveness of Redis during a few critical
circumstances where a lot of work is done inside the timer.

An example of such a critical condition is mass-expire of a lot of keys
in the same second. Up to a given percentage of CPU time is used to
perform expired keys collection per expire cylce. Now changing the
REDIS_HZ macro it is possible to do less work but more times per second
in order to block the server for less time.

If this patch will work well in our tests it will enter Redis 2.6-final.
2012-05-14 17:43:07 +02:00
antirez
f078d5628d Comment improved so that the code goal is more clear. Thx to @agladysh. 2012-05-12 09:33:29 +02:00
antirez
3a401464e9 More incremental active expired keys collection process.
If a large amonut of keys are all expiring about at the same time, the
"active" expired keys collection cycle used to block as far as the
percentage of already expired keys was >= 25% of the total population of
keys with an expire set.

This could block the server even for many seconds in order to reclaim
memory ASAP. The new algorithm uses at max a small amount of
milliseconds per cycle, even if this means reclaiming the memory less
promptly it also means a more responsive server.
2012-05-12 09:33:24 +02:00
antirez
25496f4700 redis-cli pipe mode: handle EINTR properly as well so that SIGSTOP/SIGCONT are handled correctly. 2012-05-12 09:33:15 +02:00
antirez
346825c7ed redis-cli pipe mode: handle EAGAIN while writing to socket. 2012-05-12 09:33:11 +02:00
antirez
dd4e8203b2 redis-cli --pipe for mass import. 2012-05-12 09:33:06 +02:00
antirez
91d18504c2 Fix PREFIX typo in Makefile. 2012-05-09 20:45:13 +02:00
antirez
f580a3e3a0 Allow PREFIX to be overrided in Makefile. 2012-05-09 10:34:58 +02:00
antirez
184b8e78c6 Redis 2.5.9 (2.6 RC3). 2012-05-06 10:11:54 +02:00
Pieter Noordhuis
0ef889274f Compare integers in ziplist regardless of encoding
Because of the introduction of new integer encoding types for ziplists
in the 2.6 tree, the same integer value may have a different encoding in
different versions of the ziplist implementation. This means that the
encoding can NOT be used as a fast path in comparing integers.
2012-05-04 17:26:24 -07:00
antirez
0cf10e8e86 syncio.c read / write functions reworked for correctness and performance.
The new implementation start reading / writing before blocking with
aeWait(), likely the descriptor can accept writes or has buffered data
inside and we can go faster, otherwise we get an error and wait.

This change has effects on speed but also on correctness: on socket
errors when we perform non blocking connect(2) write is performed ASAP
and the error is returned ASAP before waiting.

So the practical effect is that now a Redis slave is more available if it
can not connect to the master, previously the slave continued to block on
syncWrite() trying to send SYNC, and serving commands very slowly.
2012-05-02 22:45:12 +02:00
antirez
9b43b1ef4d Remove useless trailing space in SYNC command sent to master. 2012-05-02 21:48:08 +02:00
antirez
0b08d64882 Use specific error if master is down and slave-serve-stale-data is set to no.
We used to reply -ERR ... message ..., now the reply is
instead -MASTERDOWN ... message ... so that it can be distinguished
easily by the other error conditions.
2012-05-02 17:14:45 +02:00
antirez
0f07781538 Redis 2.5.8 (2.6.0 RC2). 2012-05-02 12:17:21 +02:00
Pieter Noordhuis
9311d2b527 Use safe dictionary iterator from KEYS
Every matched key in a KEYS call is checked for expiration. When the key
is set to expire, the call to `getExpire` will assert that the key also
exists in the main dictionary. This in turn causes a rehashing step to
be executed. Rehashing a dictionary when there is an iterator active may
result in the iterator emitting duplicate entries, or not emitting some
entries at all. By using a safe iterator, the rehash step is omitted.
2012-04-30 10:16:20 -07:00
antirez
7c5d96d98e Redis 2.5.7 (2.6 RC1) 2012-04-27 16:40:07 +02:00
antirez
603adb2b29 memtest.c fixed to actually use v1 and v2 in memtest_fill_value(). 2012-04-27 16:28:31 +02:00
antirez
2c0aae760f redis-cli commands description in help.h updated. 2012-04-27 15:57:17 +02:00
antirez
b1aa7183d7 Update makefile dependencies. 2012-04-27 15:56:16 +02:00
antirez
77a75fdef5 Set LUA_MASKCOUNT hook more selectively. Fixes issue #480.
An user reported a crash with Redis scripting (see issue #480 on
github), inspection of the kindly provided strack trace showed that
server.lua_caller was probably set to NULL. The stack trace also slowed
that the call to the hook was originating from a point where we just
used to set/get a few global variables in the Lua state.

What was happening is that we did not set the timeout hook selectively
only when the user script was called. Now we set it more selectively,
specifically only in the context of the lua_pcall() call, and make sure
to remove the hook when the call returns. Otherwise the hook can get
called in random contexts every time we do something with the Lua
state.
2012-04-27 11:47:30 +02:00
antirez
e69e76d442 Re-introduce -g -rdynamic -ggdb when linking, fixing strack traces.
A previous commit removed -g -rdynamic -ggdb as LDFLAGS, not allowing
Redis to produce a stack trace wth symbol names on crash.
This commit fixes the issue.
2012-04-27 11:47:27 +02:00
antirez
9db4cea235 Produce the stack trace in an async safe way. 2012-04-27 11:47:22 +02:00
antirez
a28ab2a92b Don't use an alternative stack for SIGSEGV & co.
This commit reverts most of c575766202773c858be0870c20cd495b722927c3, in
order to use back main stack for signal handling.

The main reason is that otherwise it is completely pointless that we do
a lot of efforts to print the stack trace on crash, and the content of
the stack and registers as well. Using an alternate stack broken this
feature completely.
2012-04-27 11:47:18 +02:00
David Tran
8111a803cb Spelling: s/synchrnonization/synchronization 2012-04-27 11:47:08 +02:00
antirez
12a042f863 redis-check-dump now is RDB version 6 ready. 2012-04-24 19:35:36 +02:00
antirez
717145c1a3 Spurious debugging printf removed. 2012-04-24 19:35:33 +02:00
antirez
dcd4efe9ef Added two new encodings to ziplist.c
1) One integer "immediate" encoding that can encode from 0 to 12 in the
encoding byte itself.
2) One 8 bit signed integer encoding that can encode 8 bit signed small
integers in a single byte.

The idea is to exploit all the not used bits we have around in a
backward compatible way.
2012-04-24 19:35:29 +02:00
antirez
62bfa66284 rdbLoad() should check REDIS_RDB_VERSION instead of hardcoded number. 2012-04-24 19:35:25 +02:00
antirez
dd51571cc3 ziplist.c: added comments about the new 24 bit encoding. 2012-04-24 19:35:22 +02:00
Grisha Trubetskoy
ad91404a32 Add a 24bit integer to ziplists to save one byte for ints that can
fit in 24 bits (thanks to antirez for catching and solving the two's compliment
bug).

Increment REDIS_RDB_VERSION to 6
2012-04-24 19:35:18 +02:00
antirez
9de5d4600a A few compiler warnings suppressed. 2012-04-24 19:35:03 +02:00
antirez
38b60dea54 Fix and refactoring of code used to get registers on crash.
This fixes compilation on FreeBSD (and possibly other systems) by
not using ucontext_t at all if HAVE_BACKTRACE is not defined.
Also the ifdefs to get the registers are modified to explicitly test for the
operating system in the first level, and the arch in the second level
of nesting.
2012-04-24 19:34:15 +02:00
antirez
537dafab84 Remove loadfile() access from the scripting engine. 2012-04-24 19:33:58 +02:00
antirez
e337b26050 Even inside #if 0 comments are comments. 2012-04-21 21:49:34 +02:00
antirez
590d55a206 Limit memory used by big SLOWLOG entries.
Two limits are added:

1) Up to SLOWLOG_ENTRY_MAX_ARGV arguments are logged.
2) Up to SLOWLOG_ENTRY_MAX_STRING bytes per argument are logged.
3) slowlog-max-len is set to 128 by default (was 1024).

The number of remaining arguments / bytes is logged in the entry
so that the user can understand better the nature of the logged command.
2012-04-21 21:00:33 +02:00
antirez
ca577d162a SHUTDOWN NOSAVE now can stop a non returning script. Issue #466. 2012-04-19 23:36:04 +02:00
antirez
d54943b76d Currenly not used code in dict.c commented out. 2012-04-18 23:56:15 +02:00
antirez
1d82bbd432 cr16.c removed from 2.6 branch, was not used. 2012-04-18 23:41:00 +02:00