6190 Commits

Author SHA1 Message Date
antirez
3669f96e11 Issue #4027: unify comment and modify return value in freeMemoryIfNeeded().
It looks safer to return C_OK from freeMemoryIfNeeded() when clients are
paused because returning C_ERR may prevent success of writes. It is
possible that there is no difference in practice since clients cannot
execute writes while clients are paused, but it looks more correct this
way, at least conceptually.

Related to PR #4028.
2017-06-27 17:53:36 +02:00
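A minimal sketch of the return-value choice described in 3669f96e11, assuming a simplified server state. The struct, its fields, and the function name are hypothetical and stand in for the real freeMemoryIfNeeded(); only C_OK and C_ERR come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define C_OK 0
#define C_ERR -1

/* Hypothetical, simplified server state for illustration only. */
typedef struct {
    int clients_paused;   /* non-zero while clients are paused */
    size_t used_memory;   /* bytes currently in use */
    size_t maxmemory;     /* configured limit, 0 means no limit */
    size_t evictable;     /* bytes that eviction could reclaim */
} server_sketch;

/* Returns C_OK when memory is within the limit or clients are paused,
 * C_ERR when the limit is exceeded and not enough can be evicted. */
int free_memory_if_needed_sketch(server_sketch *s) {
    assert(s != NULL);

    /* While paused, do not evict anything and do not report failure. */
    if (s->clients_paused) return C_OK;

    if (s->maxmemory == 0 || s->used_memory <= s->maxmemory) return C_OK;

    size_t to_free = s->used_memory - s->maxmemory;
    if (to_free > s->evictable) return C_ERR;

    /* Pretend to evict exactly the amount needed. */
    s->used_memory -= to_free;
    s->evictable -= to_free;
    return C_OK;
}
```

In this sketch a paused server that is over its limit still reports C_OK and leaves its memory untouched, which mirrors the conceptual argument made in the commit message.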
Suraj Narkhede
896c4690dd Fix following issues in blocking commands:
1. brpop last key index, thus checking all keys for slots.
2. Memory leak in clusterRedirectBlockedClientIfNeeded.
3. Remove while loop in clusterRedirectBlockedClientIfNeeded.
2017-06-27 17:53:36 +02:00
Zachary Marquez
deeb795acc Prevent expirations and evictions while paused
Proposed fix to https://github.com/antirez/redis/issues/4027
2017-06-27 17:53:36 +02:00
antirez
a6615423e2 Upgrade 4.0 changelog with more backward incompatibilities. 2017-06-23 18:34:59 +02:00
xuzhou
0b367871c4 Optimize set command with ex/px when updating aof. 2017-06-22 11:01:27 +02:00
antirez
2ae733d924 redis-benchmark: add -t hset target. 2017-06-22 11:01:27 +02:00
xuzhou
63e1c9f224 Fix set with ex/px option when propagated to aof 2017-06-22 11:01:27 +02:00
minghang.zmh
0231156f6a fix server.stat_net_output_bytes calc bug 2017-06-20 17:03:33 +02:00
xuchengxuan
e99954e4d4 Fixed comments of slowlog duration 2017-06-20 16:56:47 +02:00
cbgbt
d048f9721c cli: Only print elapsed time on OUTPUT_STANDARD 2017-06-20 16:54:44 +02:00
Aric Huang
b5f22939c2 (fix) Update create-cluster README
Fix a few typos/adjust wording in `create-cluster` README
2017-06-20 16:54:44 +02:00
antirez
0b7ba621b7 SLOWLOG: log offending client address and name. 2017-06-15 17:02:16 +02:00
Antonio Mallia
1fbc90fe03 Removed duplicate 'sys/socket.h' include 2017-06-15 17:02:16 +02:00
Antonio Mallia
c7a6b711f3 Fixed comment in clusterMsg version field 2017-06-15 17:02:16 +02:00
Qu Chen
73d358f79f Implement getKeys procedure for georadius and georadiusbymember
commands.
2017-06-14 18:16:24 +02:00
antirez
c782d1894b Fix PERSIST expired key resuscitation issue #4048. 2017-06-13 10:37:28 +02:00
antirez
cb548bf3c0 More informative -MISCONF error message. 2017-05-19 12:04:11 +02:00
antirez
8cd6a2bd86 Collect fork() timing info only if fork succeeded. 2017-05-19 12:03:54 +02:00
antirez
a3941aa569 redis-cli --bigkeys: show error when TYPE fails.
Close #3993.
2017-05-15 11:23:55 +02:00
antirez
6b21cebd3d Modules TSC: use atomic var for server.unixtime.
This avoids Helgrind complaining, but we are actually not using
atomicGet() to get the unixtime value for now: too many places where it
is used and given that time_t is word-sized it should be safe in all the
archs we support as it is.

On the other hand, Helgrind, when Redis is compiled with "make helgrind"
in order to force the __sync macros, will detect the write in
updateCachedTime() as a read (because atomic functions are used) and
will not complain about races.

This commit also includes minor refactoring of mutex initializations and
a "helgrind" target in the Makefile.
2017-05-11 16:44:46 +02:00
antirez
54bd224f0e atomicvar.h: show used API in INFO. Add macro to force __sync builtin.
The __sync builtin can be correctly detected by Helgrind so to force it
is useful for testing. The API in the INFO output can be useful for
debugging after problems are reported.
2017-05-11 16:44:46 +02:00
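The idea in 54bd224f0e (select an atomic implementation at compile time, allow forcing the __sync builtins, and expose which one is in use) could be sketched roughly as follows. The macro and function names here are hypothetical and are not the contents of atomicvar.h; the sketch assumes a GCC or Clang compiler that provides the __atomic and __sync builtins.

```c
#include <assert.h>

/* Hypothetical switch: define SKETCH_FORCE_SYNC at build time to force
 * the __sync path, for example when testing with Helgrind. */
#if !defined(SKETCH_FORCE_SYNC) && defined(__ATOMIC_RELAXED)
#define SKETCH_ATOMIC_API "atomic-builtin"
#define sketch_incr(var, n) __atomic_add_fetch(&(var), (n), __ATOMIC_RELAXED)
#define sketch_get(var) __atomic_load_n(&(var), __ATOMIC_RELAXED)
#else
#define SKETCH_ATOMIC_API "sync-builtin"
#define sketch_incr(var, n) __sync_add_and_fetch(&(var), (n))
/* Adding zero is a simple way to read a value with the __sync family. */
#define sketch_get(var) __sync_add_and_fetch(&(var), 0)
#endif

static long sketch_counter = 0;

/* Atomically add n to the counter and return the new value. */
long sketch_counter_add(long n) { return sketch_incr(sketch_counter, n); }

/* Atomically read the counter. */
long sketch_counter_read(void) { return sketch_get(sketch_counter); }

/* Name of the selected implementation, suitable for a diagnostic line. */
const char *sketch_atomic_api(void) { return SKETCH_ATOMIC_API; }
```

Reporting the selected implementation as a string is what makes it possible to show it in diagnostic output after a problem is reported.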
antirez
a864d25c6e zmalloc.c: remove thread safe mode, it's the default way. 2017-05-11 16:44:46 +02:00
antirez
b338f2b908 Modules TSC: Add mutex for server.lruclock.
Only useful for when no atomic builtins are available.
2017-05-11 16:44:46 +02:00
antirez
7e9c658d13 Modules TSC: Improve inter-thread synchronization.
More work to do with server.unixtime and similar. Need to write Helgrind
suppression file in order to suppress the false positives.
2017-05-11 16:44:46 +02:00
antirez
e69af32fd7 Simplify atomicvar.h usage by having the mutex name implicit. 2017-05-11 16:44:46 +02:00
antirez
26e57f177d Lazyfree: fix lazyfreeGetPendingObjectsCount() race reading counter. 2017-05-11 16:44:46 +02:00
antirez
2acf003c05 Modules TSC: HELLO.KEYS reply format fixed. 2017-05-11 16:44:46 +02:00
antirez
12fd298fe7 Modules TSC: put the client in the pending write list. 2017-05-11 16:44:26 +02:00
antirez
5b1afa4a22 adlist: fix final list count in listJoin(). 2017-05-11 16:44:26 +02:00
antirez
717b2eeab3 adlist: fix listJoin() to handle empty lists. 2017-05-11 16:44:26 +02:00
antirez
a839036a1d Modules: remove unused var in example module. 2017-05-11 16:44:26 +02:00
antirez
eda5ee5e91 Modules TSC: HELLO.KEYS example draft finished. 2017-05-11 16:44:26 +02:00
antirez
fb8734fe9c Module: fix RedisModule_Call() "l" specifier to create a raw string. 2017-05-11 16:44:26 +02:00
antirez
c4b884958e Modules TSC: Release the GIL for all the time we are blocked.
Instead of giving the module background operations just a small time to
run in the beforeSleep() function, we can have the lock released for all
the time we are blocked in the multiplexing syscall.
2017-05-11 16:44:26 +02:00
antirez
fcd9a07df0 Modules TSC: Export symbols of the new API. 2017-05-11 16:44:26 +02:00
antirez
8affa3e78f Modules TSC: Handling of RM_Reply* functions. 2017-05-11 16:44:03 +02:00
antirez
31b1f3c1ae Modules TSC: Basic TS context creation and handling. 2017-05-11 16:44:03 +02:00
antirez
74f3a84390 Modules TSC: GIL and cooperative multi tasking setup. 2017-05-11 16:44:03 +02:00
antirez
5021fda2b9 Regression test for #3899 fixed. 2017-05-11 16:43:46 +02:00
antirez
166bdbda03 Regression test for PSYNC2 issue #3899 added.
Experimentally verified that it can trigger the issue reverting the fix.
At least on my system... Since the bug is time/backlog dependent, it is
very hard to tell if this test will be able to trigger the problem
consistently, however even if it triggers the problem once in a while,
we'll see it in the CI environment at http://ci.redis.io.
2017-04-28 10:40:44 +02:00
antirez
b506eb74ac Check event loop creation return value. Fix #3951.
Normally we never check for OOM conditions inside Redis since the
allocator will always return a pointer or abort the program on OOM
conditions. However we have no control over epoll_create(), that may
fail for kernel OOM (according to the manual page) even if all the
parameters are correct, so the function aeCreateEventLoop() may indeed
return NULL and this condition must be checked.
2017-04-28 10:40:44 +02:00
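As a rough illustration of the check described in b506eb74ac, the sketch below shows a constructor that can fail because of its kernel backend, and a caller that tests for NULL. All names are hypothetical: backend_create stands in for a call such as epoll_create(), and the functions are not the real aeCreateEventLoop() or its caller.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, minimal event loop state. */
typedef struct {
    int backend_fd;
    int setsize;
} loop_sketch;

/* Create a loop. backend_create returns a descriptor or -1 on failure,
 * which can happen even when every parameter is valid. */
loop_sketch *loop_create(int setsize, int (*backend_create)(int)) {
    assert(backend_create != NULL);
    loop_sketch *l = malloc(sizeof(*l));
    if (l == NULL) return NULL;
    l->backend_fd = backend_create(setsize);
    if (l->backend_fd == -1) {
        free(l);
        return NULL;
    }
    l->setsize = setsize;
    return l;
}

/* Caller side: the NULL return must be checked instead of assumed away.
 * Returns 0 on success and -1 if the loop could not be created. */
int server_init_sketch(loop_sketch **out, int setsize,
                       int (*backend_create)(int)) {
    *out = loop_create(setsize, backend_create);
    if (*out == NULL) return -1;
    return 0;
}

/* Two stand-in backends used to exercise both paths. */
int backend_ok(int setsize) { (void)setsize; return 3; }
int backend_fail(int setsize) { (void)setsize; return -1; }
```

How the caller reacts to the failure (logging, exiting) is left out; the sketch only shows that the condition is detectable and has to be tested.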
antirez
806905627c PSYNC2: fix master cleanup when caching it.
The master client cleanup was incomplete: resetClient() was missing and
the output buffer of the client was not reset, so pending commands
related to the previous connection could be still sent.

The first problem caused the client argument vector to be, at times,
half populated, so that when the correct replication stream arrived the
protocol got mixed with the arguments creating invalid commands that nobody
called.

Thanks to @yangsiran for also investigating this problem, after
already providing important design / implementation hints for the
original PSYNC2 issues (see referenced Github issue).

Note that this commit adds a new function to the list library of Redis
in order to be able to reset a list without destroying it.

Related to issue #3899.
2017-04-27 17:08:53 +02:00
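The note in 806905627c about resetting a list without destroying it could look something like the following. This is a sketch on a hypothetical singly linked list with made-up names, not the adlist code itself: the point is only that the nodes are freed while the list header stays allocated and reusable.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal list types, for illustration only. */
typedef struct node_sketch {
    struct node_sketch *next;
    int value;
} node_sketch;

typedef struct {
    node_sketch *head;
    node_sketch *tail;
    unsigned long len;
} list_sketch;

/* Allocate an empty list header. Returns NULL on allocation failure. */
list_sketch *list_create(void) {
    list_sketch *l = malloc(sizeof(*l));
    if (l == NULL) return NULL;
    l->head = l->tail = NULL;
    l->len = 0;
    return l;
}

/* Append a value. Returns 0 on success, -1 on allocation failure. */
int list_push(list_sketch *l, int value) {
    assert(l != NULL);
    node_sketch *n = malloc(sizeof(*n));
    if (n == NULL) return -1;
    n->value = value;
    n->next = NULL;
    if (l->tail) l->tail->next = n;
    else l->head = n;
    l->tail = n;
    l->len++;
    return 0;
}

/* Free every node but keep the header, so the same list pointer
 * can continue to be used by whoever holds it. */
void list_reset(list_sketch *l) {
    assert(l != NULL);
    node_sketch *cur = l->head;
    while (cur != NULL) {
        node_sketch *next = cur->next;
        free(cur);
        cur = next;
    }
    l->head = l->tail = NULL;
    l->len = 0;
}
```

Keeping the header alive matters when other structures hold a pointer to the list, as with a client's reply list that must be emptied without replacing the list object.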
antirez
8c4b0f411f Defrag: test currently disabled, too many false positives.
Related to #3786.
2017-04-22 16:00:16 +02:00
antirez
6839c759b8 Reformat 4.0 RC3 change log. 2017-04-22 13:49:41 +02:00
antirez
51b12ed1b5 Defrag: fix test false positive.
Apparently 1.4 is too low compared to what you get in certain setups
(including mine). I raised it to 1.55 that hopefully is still enough to
test that the fragmentation went down from 1.7 but without incurring in
issues, however the test setup may be still fragile so certain times this
may lead to false positives again, it's hard to test for these things
in a deterministic way.

Related to #3786.
4.0-rc3
2017-04-22 13:23:27 +02:00
antirez
635bbe573a Redis 4.0.0-RC3 (3.9.103). 2017-04-22 13:16:41 +02:00
oranagra
94a7090705 add test for active defrag 2017-04-22 13:16:01 +02:00
antirez
1a7a532e96 Revert "Jemalloc updated to 4.4.0."
This reverts commit 36c1acc222d29e6e2dc9fc25362e4faa471111bd.
2017-04-22 13:12:42 +02:00
antirez
6bc6bd4c38 PSYNC2: discard pending transactions from cached master.
During the review of the fix for #3899, @yangsiran identified an
implementation bug: given that the offset is now relative to the applied
part of the replication log, when we cache a master, the successive
PSYNC2 request will be made in order to *include* the transaction that
was not completely processed. This means that we need to discard any
pending transaction from our replication buffer: it will be re-executed.
2017-04-20 07:58:24 +02:00
antirez
a91cc5bc2d Fix PSYNC2 incomplete command bug as described in #3899.
This bug was discovered by @kevinmcgehee and constituted a major hidden
bug in the PSYNC2 implementation, caused by the propagation from the
master of incomplete commands to slaves.

The bug had several results:

1. Borrowing from Kevin's text in the issue: "Given that slaves blindly
copy over their master's input into their own replication backlog over
successive read syscalls, it's possible that with large commands or
small TCP buffers, partial commands are present in this buffer. If the
master were to fail before successfully propagating the entire command
to a slave, the slaves will never execute the partial command (since the
client is invalidated) but will copy it to replication backlog which may
relay those invalid bytes to its slaves on PSYNC2, corrupting the
backlog and possibly other valid commands that follow the failover.
Simple command boundaries aren't sufficient to capture this, either,
because in the case of a MULTI/EXEC block, if the master successfully
propagates a subset of the commands but not the EXEC, then the
transaction in the backlog becomes corrupt and could corrupt other
slaves that consume this data."

2. As identified by @yangsiran later, there is another effect of the
bug. For the same mechanism of the first problem, a slave having another
slave, could receive a full resynchronization request with an already
half-applied command in the backlog. Once the RDB is ready, it will be
sent to the slave, and the replication will continue sending to the
sub-slave the other half of the command, which is not valid.

The fix, designed by @yangsiran and @antirez, and implemented by
@antirez, uses a secondary buffer in order to feed the sub-masters and
update the replication backlog and offsets, only when a given part of
the query buffer is actually *applied* to the state of the instance,
that is, when the command gets processed and the command is not pending
in the Redis transaction buffer because of CLIENT_MULTI state.

Given that now the backlog and offsets representation are in agreement
with the actual processed commands, both issue 1 and 2 should no longer
be possible.

Thanks to @kevinmcgehee, @yangsiran and @oranagra for their work in
identifying and designing a fix for this problem.
2017-04-20 07:58:22 +02:00
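The approach described in a91cc5bc2d and 6bc6bd4c38 (feed the backlog only with bytes that were actually applied, and discard anything still pending when the master is cached) can be sketched as below. This is a toy model under loud assumptions: commands are newline-terminated lines rather than the Redis protocol, buffers are fixed-size arrays, and every type and function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_CAP 256

/* Hypothetical replica state. backlog_len doubles as the offset. */
typedef struct {
    char qbuf[BUF_CAP];     /* received but not yet applied bytes */
    size_t qlen;
    size_t parsed;          /* prefix of qbuf parsed as whole commands */
    char backlog[BUF_CAP];  /* bytes safe to relay to sub-replicas */
    size_t backlog_len;
    int in_multi;           /* inside a MULTI ... EXEC block */
} replica_sketch;

static int line_is(const char *p, size_t n, const char *word) {
    return strlen(word) == n && memcmp(p, word, n) == 0;
}

/* Feed bytes read from the master. Returns 0, or -1 if a buffer is full. */
int replica_feed(replica_sketch *r, const char *data, size_t len) {
    assert(r != NULL);
    if (r->qlen + len > BUF_CAP) return -1;
    memcpy(r->qbuf + r->qlen, data, len);
    r->qlen += len;

    for (;;) {
        char *start = r->qbuf + r->parsed;
        char *nl = memchr(start, '\n', r->qlen - r->parsed);
        if (nl == NULL) break;  /* partial command: never reaches backlog */

        size_t n = (size_t)(nl - start);
        if (line_is(start, n, "MULTI")) r->in_multi = 1;
        else if (line_is(start, n, "EXEC")) r->in_multi = 0;
        r->parsed += n + 1;

        /* Inside a transaction nothing is applied yet, so hold the bytes. */
        if (r->in_multi) continue;

        /* Applied: only now move the bytes to the backlog. */
        if (r->backlog_len + r->parsed > BUF_CAP) return -1;
        memcpy(r->backlog + r->backlog_len, r->qbuf, r->parsed);
        r->backlog_len += r->parsed;
        memmove(r->qbuf, r->qbuf + r->parsed, r->qlen - r->parsed);
        r->qlen -= r->parsed;
        r->parsed = 0;
    }
    return 0;
}

/* When caching the master, drop unapplied bytes: they will be sent
 * again because the offset only covers what was applied. */
void replica_discard_pending(replica_sketch *r) {
    assert(r != NULL);
    r->qlen = 0;
    r->parsed = 0;
    r->in_multi = 0;
}
```

In this model a half-received command or an unfinished MULTI block never changes backlog_len, so a sub-replica reading the backlog only ever sees complete, applied commands.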