Commit Graph

150 Commits

afc4b36c10 Don't treat unsupported protocols as fatal errors
If we encounter an unsupported protocol in the "bind" list, don't
ipso facto consider it a fatal error. We continue to abort startup if
there are no listening sockets at all.

This ensures that the lack of IPv6 support does not prevent Redis from
starting on Debian where we try to bind to the ::1 interface by default
(via "bind 127.0.0.1 ::1"). A machine with IPv6 disabled (such as some
container systems) would simply fail to start Redis after the initial
call to apt(8).

This is similar to the case where "bind" is not specified:

  https://github.com/antirez/redis/issues/3894

... and was based on the corresponding PR:

  https://github.com/antirez/redis/pull/4108

... but also adds EADDRNOTAVAIL to the list of errors to catch, which I
believe is missing there.

This issue was raised in Debian as both <https://bugs.debian.org/900284>
& <https://bugs.debian.org/914354>.
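
For illustration, a minimal sketch of the idea (the bind_and_listen()
helper and the exact errno list are assumptions based on the message
above, not the literal server.c patch):

    #include <errno.h>

    static int listen_to_addresses(char **addrs, int count, int port,
                                   int (*bind_and_listen)(const char *, int)) {
        int listening = 0;
        for (int j = 0; j < count; j++) {
            if (bind_and_listen(addrs[j], port) == -1) {
                /* Unsupported protocol (e.g. "::1" without IPv6):
                 * skip it, don't abort. */
                if (errno == EADDRNOTAVAIL || errno == EAFNOSUPPORT ||
                    errno == EPROTONOSUPPORT) continue;
                return -1;             /* other errors remain fatal */
            }
            listening++;
        }
        return listening ? listening : -1; /* abort if nothing is listening */
    }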
2018-12-21 12:40:28 +01:00
9535c21515 Fix rare replication stream corruption with disk-based replication
The slave sends \n keepalive messages to the master while parsing the rdb,
and later sends REPLCONF ACK once a second. Rarely, the master receives both
a linefeed char and a REPLCONF in the same read, \n*3\r\n$8\r\nREPLCONF\r\n...
and it tries to trim two chars (\r\n) from the query buffer,
trimming the '*' from *3\r\n$8\r\nREPLCONF\r\n...

The master then tries to process a command starting with '3' and replies to
the slave with a bunch of -ERR and one +OK.
Although the slave ignores these replies (it only prints a log message), this
corrupts the replication offset at the slave, since the slave increases its
replication offset while the master does not.

Other than the fix in processInlineBuffer (sketched below), I made several
other improvements while hunting this very rare bug.
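
A minimal sketch of the trimming logic after the fix (this mirrors the
spirit of the processInlineBuffer change, not the literal diff):

    #include <string.h>
    #include <stddef.h>

    /* Return how many bytes to consume from the query buffer: everything
     * up to and including the first '\n', whether or not a '\r' precedes
     * it. *linelen is the length of the inline line without terminator. */
    static size_t inline_consumed_len(const char *buf, size_t buflen,
                                      size_t *linelen) {
        const char *newline = memchr(buf, '\n', buflen);
        if (newline == NULL) return 0;      /* incomplete line: wait */
        *linelen = (size_t)(newline - buf);
        if (*linelen && newline[-1] == '\r') (*linelen)--;
        /* Consuming exactly newline+1 bytes keeps a following
         * "*3\r\n$8\r\nREPLCONF..." request intact, instead of blindly
         * trimming two characters and eating its '*'. */
        return (size_t)(newline - buf) + 1;
    }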

- When Redis replies with "unknown command" it now includes a portion of
  the arguments, not just the command name, so it is easier to understand
  what was received. In my case, on the slave side, it was -ERR, but the
  "arguments" were the interesting part (containing info on the error).
- About a year ago I added code in addReplyErrorLength to print the error
  to the log in the case of a reply to a master (since this string isn't
  actually transmitted to the master); that block now also prints a similar
  log message to indicate an error being sent from the master to the slave.
  Note that the slave is marked as CLIENT_SLAVE only after PSYNC is
  received, so this will not cause any harm for REPLCONF, and will only
  indicate problems that are going to corrupt the replication stream anyway.
- In two places where c->reply was emptied, I now also reset sentlen. This
  is a precaution (I did not actually see such a problem), since a non-zero
  sentlen would cause corruption to be transmitted on the socket.
2018-07-23 18:16:22 +02:00
70597a3011 Cluster: ability to prevent slaves from failing over their masters.
This commit, in some parts derived from PR #3041 which is no longer
possible to merge (because the user deleted the original branch),
implements the ability of slaves to have a special configuration
preventing them from trying to start a failover when the master is failing.

There are multiple reasons for wanting this, and the feature was
requested in issue #3021 some time ago.

The differences between this patch and the original PR are the
following:

1. The flag is saved/loaded on the nodes configuration.
2. The 'myself' node is now flag-aware, the flag is updated as needed
   when the configuration is changed via CONFIG SET.
3. The flag name uses NOFAILOVER instead of NO_FAILOVER to be consistent
   with existing NOADDR.
4. The redis.conf documentation was rewritten.

Thanks to @deep011 for the original patch.
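
A sketch of how such a flag gates the failover attempt (the bit value and
helper are illustrative, not the literal cluster.c code):

    #include <stdint.h>

    #define CLUSTER_NODE_NOFAILOVER (1 << 9) /* assumed bit, for the sketch */

    struct cluster_node { uint64_t flags; };

    /* A slave carrying NOFAILOVER refuses to start the election when its
     * master appears to be failing; everything else proceeds as usual. */
    static int may_start_failover(const struct cluster_node *myself) {
        if (myself->flags & CLUSTER_NODE_NOFAILOVER) return 0;
        return 1;
    }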
2018-03-14 16:31:46 +01:00
83923afa8c Track number of logically expired keys still in memory.
This commit adds two new fields in the INFO output, stats section:

expired_stale_perc:0.34
expired_time_cap_reached_count:58

The first field is an estimate of the percentage of keys that are still in
memory but are already logically expired. The reason why those keys are not
yet reclaimed is that the active expire cycle can't spend more time on the
process of reclaiming them, and at the same time nobody is accessing such
keys. As the active expire cycle runs, before eventually returning to the
caller because of the time limit or because less than 25% of the keys in
each given database are logically expired, it collects the stats in order
to populate this INFO field.

Note that expired_stale_perc is a running average, where the current
sample accounts for 5% and the history for 95%, so you'll see it
changing smoothly over time.
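
As a sketch, the update is a plain exponential moving average (the
variable names are illustrative, not server.h's):

    static double expired_stale_perc = 0; /* value exported in INFO */

    /* Called by the expire cycle with the latest sampled percentage:
     * the new sample weighs 5%, the accumulated history 95%. */
    static void update_stale_perc(double current_sample) {
        expired_stale_perc = current_sample * 0.05 +
                             expired_stale_perc * 0.95;
    }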

The other field, expired_time_cap_reached_count, counts the number of
times the expire cycle had to stop because of the time limit, even if it
was still finding a sizeable number of keys yet to expire. This allows
people handling operations to understand whether the Redis server, during
mass-expiration events, is able to collect keys fast enough. It is normal
for this field to increment during mass expires, but otherwise it should
increment very rarely. When instead it increments constantly, it means
that the current workload is spending a significant percentage of CPU
time expiring keys.

This feature was created thanks to the hints of Rashmi Ramesh and
Bart Robinson from Twitter. In private email exchanges, they noted how
it was important to improve the observability of this parameter in the
Redis server. In big deployments, the amount of keys that are yet to be
reclaimed in each server, even if they are logically expired, may account
for a very large amount of wasted memory.
2018-02-19 11:22:34 +01:00
5f9b9e1194 Fix the firstkey, lastkey, and keystep of moduleCommand 2018-01-24 10:58:39 +01:00
48568ab6d7 Remove useless comment from serverCron().
The behavior is well specified by the code itself.
2018-01-18 12:16:42 +01:00
0201dea577 Fix bug #4545: infinite loop in AOF rewrite 2018-01-18 12:16:37 +01:00
87fe813b3a New config options about protocol prefixed with "proto".
Related to #4568.
2018-01-18 12:14:53 +01:00
ff2e628f4e Add config options for max-bulk-len and max-querybuf-len mainly to support RESTORE of large keys 2018-01-18 12:13:50 +01:00
896cf1a9d9 Fix bug #4545: infinite loop in AOF rewrite 2018-01-18 12:12:58 +01:00
ab3d3aca48 Prevent corruption of server.executable after DEBUG RESTART.
Doing the following ended with a broken server.executable:

1. Start Redis with src/redis-server
2. Send CONFIG SET DIR /tmp/
3. Send DEBUG RESTART

At this point we called execve() with an argv[0] that no longer resolved
relative to the new working directory, so after the restart the absolute
path of the executable was recomputed in the wrong way. With this fix we
pass the already computed absolute path as argv[0].
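
The shape of the fix, as a hedged sketch (server_executable stands in for
Redis's server.executable; the surrounding code is illustrative):

    #include <unistd.h>

    static void restart_process(char *server_executable,
                                char **argv, char **envp) {
        /* Use the absolute path computed at startup as argv[0], so a
         * previous CONFIG SET dir cannot break how the executable path
         * is recomputed after the restart. */
        argv[0] = server_executable;
        execve(server_executable, argv, envp);
        /* Only reached on failure; the caller reports the error. */
    }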
2017-11-30 18:40:20 +01:00
b7c7edf912 Be more verbose when DEBUG RESTART fails. 2017-11-30 18:40:17 +01:00
6b6a83c7ab Fix the command table entry for OBJECT to account for the HELP option.
After #4472 the command may have just 2 arguments.
2017-11-28 17:34:47 +01:00
3594238350 PSYNC2: clarify the scenario when repl_stream_db can be -1 2017-11-24 11:09:28 +01:00
9a3e15c6a2 Modules: fix for scripting replication of modules commands.
See issue #4466 / #4467.
2017-11-23 15:15:50 +01:00
bc7076b090 rehash: handle one db until finished 2017-11-21 09:50:08 +01:00
097a555677 PSYNC2: Fix the way replication info is saved/loaded from RDB.
This commit attempts to fix a number of bugs reported in #4316.
They are related to the way replication info like replication ID,
offsets, and currently selected DB in the master client, are stored
and loaded by Redis. In order to avoid inconsistencies the changes in
this commit try to enforce that:

1. Replication information is only stored when the RDB file is
generated by a slave that has a valid 'master' client, so that we can
always extract the currently selected DB.
2. When replication information is persisted in the RDB file, either all
the info needed for a successful PSYNC is persisted, or nothing is.
3. The RDB replication information is only loaded if the instance is
configured as a slave, otherwise a master could start with IDs that relate
to a different history of the data set, and still retain such IDs in the
future while receiving unrelated writes.
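
A hypothetical sketch of rule 1 (the function and field names follow
rdb.c conventions, but treat the exact calls here as assumptions):

    /* Persist replication info only when a valid master client exists,
     * so the currently selected DB is always known. */
    if (server.masterhost && server.master) {
        rdbSaveAuxFieldStrStr(rdb, "repl-id", server.replid);
        rdbSaveAuxFieldStrInt(rdb, "repl-offset", server.master_repl_offset);
        rdbSaveAuxFieldStrInt(rdb, "repl-stream-db", server.master->db->id);
    }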
2017-09-20 11:47:23 +02:00
8651e5d50d Flush append only buffers before exiting.
When the SHUTDOWN command is received, it is possible that some of the
recent commands were not yet flushed from the AOF buffer, and the server
would experience data loss at shutdown.
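
A sketch of the intent in the shutdown path (hedged; flushAppendOnlyFile()
is the real AOF flush routine, the surrounding code is illustrative):

    if (server.aof_state != AOF_OFF) {
        /* Force-write whatever is still buffered, then fsync, so the
         * last commands are not lost when the process exits. */
        flushAppendOnlyFile(1); /* 1 = force the write */
        aof_fsync(server.aof_fd);
    }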
2017-09-18 12:04:31 +02:00
363be78397 Changes command stats iteration to being dict-based
With the addition of modules, looping over the redisCommandTable
misses any added commands. By moving to dictionary iteration this
is resolved.
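
A sketch of the new loop (this assumes Redis's dict.h/server.h context
rather than being standalone):

    dictIterator *di = dictGetIterator(server.commands);
    dictEntry *de;
    while ((de = dictNext(di)) != NULL) {
        struct redisCommand *cmd = dictGetVal(de);
        /* ... report cmd->calls, cmd->microseconds, ... so module
         * commands registered at runtime are included too. */
    }
    dictReleaseIterator(di);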
2017-08-02 12:52:00 +02:00
cfdcd440d7 AOF check utility: ability to check files with RDB preamble. 2017-07-14 10:55:17 +02:00
4ebfe2653c Avoid closing invalid FDs to make Valgrind happier. 2017-07-06 16:10:07 +02:00
64db804415 HMSET and MSET implementations unified. HSET now variadic.
This is the first step towards getting rid of HMSET which is a command
that does not make much sense once HSET is variadic, and has a saner
return value.
2017-07-06 16:09:36 +02:00
bdd6de963d Added GEORADIUS(BYMEMBER)_RO variants for read-only operations.
Issue #4084 shows how, due to a design error, GEORADIUS is a write command
because of the STORE option. Because of this it does not work
on readonly slaves, gets redirected to masters in Redis Cluster even
when the connection is in READONLY mode and so forth.

Breaking backward compatibility at this stage, with Redis 4.0 in an
advanced RC state, is problematic for the user base. The API may be
fixed in the unstable branch soon, if we decide to do so in order to
be more consistent, and Redis 5.0 released with this incompatibility in
the future. This is still unclear.

However, the ability to scale GEO queries in slaves easily is too
important so this commit adds two read-only variants to the GEORADIUS
and GEORADIUSBYMEMBER commands: GEORADIUS_RO and GEORADIUSBYMEMBER_RO.
The commands behave exactly like the originals, but they do not
accept the STORE and STOREDIST options.
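
Conceptually the variants share the argument parser and only reject the
store options; a hedged sketch (names are illustrative, not geo.c's):

    /* In the shared argument parser: */
    if (readonly_variant && (store_key || storedist_key)) {
        addReplyError(c, "STORE/STOREDIST are not supported by the "
                         "read-only variants");
        return;
    }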
2017-06-30 11:53:43 +02:00
de391ff11c Fix brpop command table entry and redirect blocked clients. 2017-06-30 10:13:59 +02:00
63e1c9f224 Fix SET with the EX/PX options when propagated to the AOF 2017-06-22 11:01:27 +02:00
0b7ba621b7 SLOWLOG: log offending client address and name. 2017-06-15 17:02:16 +02:00
73d358f79f Implement getKeys procedure for georadius and georadiusbymember
commands.
2017-06-14 18:16:24 +02:00
cb548bf3c0 More informative -MISCONF error message. 2017-05-19 12:04:11 +02:00
6b21cebd3d Modules TSC: use atomic var for server.unixtime.
This avoids Helgrind complaining, but we are actually not using
atomicGet() to get the unixtime value for now: too many places where it
is used, and given that time_t is word-sized it should be safe in all the
archs we support as it is.

On the other hand, Helgrind, when Redis is compiled with "make helgrind"
in order to force the __sync macros, will detect the write in
updateCachedTime() as a read (because atomic functions are used) and
will not complain about races.

This commit also includes minor refactoring of mutex initializations and
a "helgrind" target in the Makefile.
2017-05-11 16:44:46 +02:00
54bd224f0e atomicvar.h: show used API in INFO. Add macro to force __sync builtin.
The __sync builtins can be correctly detected by Helgrind, so forcing them
is useful for testing. The API reported in the INFO output can be useful
for debugging after problems are reported.
2017-05-11 16:44:46 +02:00
a864d25c6e zmalloc.c: remove thread safe mode, it's now the default. 2017-05-11 16:44:46 +02:00
b338f2b908 Modules TSC: Add mutex for server.lruclock.
Only useful when no atomic builtins are available.
2017-05-11 16:44:46 +02:00
7e9c658d13 Modules TSC: Improve inter-thread synchronization.
More work to do with server.unixtime and similar. Need to write a Helgrind
suppression file in order to suppress the false positives.
2017-05-11 16:44:46 +02:00
c4b884958e Modules TSC: Release the GIL for all the time we are blocked.
Instead of giving the module background operations just a small time to
run in the beforeSleep() function, we can have the lock released for all
the time we are blocked in the multiplexing syscall.
2017-05-11 16:44:26 +02:00
74f3a84390 Modules TSC: GIL and cooperative multi tasking setup. 2017-05-11 16:44:03 +02:00
b506eb74ac Check event loop creation return value. Fix #3951.
Normally we never check for OOM conditions inside Redis since the
allocator will always return a pointer or abort the program on OOM
conditions. However we have no control over epoll_create(), which may
fail for kernel OOM (according to the manual page) even if all the
parameters are correct, so the function aeCreateEventLoop() may indeed
return NULL and this condition must be checked.
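
The check has roughly this shape (a hedged sketch in initServer() style,
assuming Redis's ae.h/server.h context):

    server.el = aeCreateEventLoop(server.maxclients + CONFIG_FDSET_INCR);
    if (server.el == NULL) {
        serverLog(LL_WARNING,
            "Failed creating the event loop. Error message: '%s'",
            strerror(errno));
        exit(1);
    }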
2017-04-28 10:40:44 +02:00
6e1489ae50 Set lua-time-limit default value at safe place.
Otherwise, as it was, it would overwrite whatever the user set.

Close #3703.
2017-04-18 16:17:11 +02:00
f60d6f09ce Fix modules blocking commands awake delay.
If a thread unblocks a client blocked in a module command by using the
RedisModule_UnblockClient() API, the event loop may not be awakened until
the next timeout of the multiplexing API or the next unrelated I/O
operation on other clients. We actually want the client to be served
ASAP, so a mechanism is needed for the unblocking API to inform
Redis that there is a client to serve ASAP.

This commit fixes the issue using the old trick of the pipe: when a
client needs to be unblocked, a byte is written in a pipe. When we run
the list of clients blocked in modules, we consume all the bytes
written in the pipe. Writes and reads are performed inside the context
of the mutex, so no race is possible in which we consume the bytes that
are actually related to an awake request for a client that should still
be put into the list of clients to unblock.
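
A self-contained sketch of the pipe trick (names are illustrative; the
read end is registered with the event loop, so the write below makes the
multiplexing syscall return immediately):

    #include <unistd.h>

    static int wake_pipe[2]; /* created once with pipe(), non-blocking */

    static void signal_unblocked_client(void) {
        /* Performed while holding the same mutex that protects the
         * unblocked-clients list. */
        if (write(wake_pipe[1], "A", 1) != 1) {
            /* Pipe full: fine, a wake-up is already pending. */
        }
    }

    static void drain_wakeups(void) {
        char buf[128];
        /* Consume every pending byte while processing the list. */
        while (read(wake_pipe[0], buf, sizeof(buf)) > 0);
    }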

It was verified that after the fix the server handles the blocked
clients with the expected short delay.

Thanks to @dvirsky for understanding there was such a problem and
reporting it.
2017-04-18 16:16:27 +02:00
ba647598b4 Use SipHash hash function to mitigate HashDos attempts.
This change attempts to switch to a hash function which mitigates
the effects of the HashDoS attack (a denial of service attack trying
to force data structures into worst case behavior) while at the same time
providing Redis with a hash function that does not expect the input
data to be word aligned, a condition no longer true now that sds.c
strings have a variable length header.

Note that sometimes, even when using a hash function for which collisions
cannot be generated without knowing the seed, special implementation
details or the exposure of the seed in an indirect way (for example the
ability to add elements to a Set and check the order in which Redis
returns them with SMEMBERS) may make the attacker's life simpler in the
process of trying to guess the correct seed. However the next step would
be to switch to a log(N) data structure when too many items in a single
bucket are detected: this seems like overkill in the case of Redis.
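
The defense rests on a per-process random seed; the startup step looks
roughly like this (hedged: the helpers follow Redis naming, but treat
the exact signatures as assumptions):

    char hashseed[16];
    getRandomHexChars(hashseed, sizeof(hashseed));
    dictSetHashFunctionSeed((uint8_t *)hashseed);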

SPEED REGRESSION TESTS:

In order to verify that switching from MurmurHash to SipHash had
no impact on speed, a set of benchmarks involving fast insertion
of 5 million keys was performed.

The results show Redis with SipHash in high pipelining conditions
to be about 4% slower compared to using the previous hash function.
However this could partially be related to the fact that the current
implementation does not attempt to hash whole words at a time but
reads single bytes, in order to have an output which is endian-neutral
and at the same time works on systems where unaligned memory accesses
are a problem.

Further x86-specific optimizations should be tested; the function
may easily reach the same level as MurmurHash2 if a few optimizations
are performed.
2017-02-21 17:16:35 +01:00
67def2611f Active memory defragmentation 2017-01-12 09:59:30 +01:00
a0e9511894 Remove first version of ASCII wave, later discarded. 2016-12-20 13:33:46 +01:00
3334a409b3 Only show Redis logo if logging to stdout / TTY.
You can still force the logo in the normal logs.
For motivations, check issue #3112. For me the reason is that the
logo is nice to have in interactive sessions, but inside the logs it
rather loses its usefulness, except for the ability of users to recognize
restarts easily: for this reason the new startup sequence shows a one-liner
ASCII "wave" so that there is still a bit of a visual clue.

Startup logging was modified in order to log events in more obvious
ways, and to log more events. Also certain important information is
now easier to parse/grep since it is printed in field=value style.

The option --always-show-logo in redis.conf was added, defaulting to no.
2016-12-20 13:33:45 +01:00
db53c23037 adjustOpenFilesLimit() comment made hopefully more clear. 2016-12-19 08:54:46 +01:00
7f870fadc2 fix unsigned int overflow in adjustOpenFilesLimit 2016-12-19 08:54:46 +01:00
2e375d4f42 Switch PFCOUNT to LogLog-Beta algorithm.
The new algorithm provides the same speed with a smaller error for
cardinalities in the range 0-100k. Before switching, the new and old
algorithm behaviors were studied in detail in the context of
issue #3677. You can find a few graphs and motivations there.
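
For reference, the general shape of the estimator from the LogLog-Beta
paper (m registers with values M[j], of which m_z are zero; the b_k are
empirically fitted constants, omitted here; see the paper for values):

    \hat{E} = \frac{\alpha_m \, m \, (m - m_z)}
                   {\beta(m_z) + \sum_{j=1}^{m} 2^{-M[j]}},
    \qquad
    \beta(m_z) = b_0 \, m_z + \sum_{k=1}^{7} b_k \big(\ln(m_z + 1)\big)^k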
2016-12-16 11:07:42 +01:00
4d475e0f88 LogLog-Beta Algorithm support within HLL
Config option to use the LogLog-Beta algorithm for cardinality.
2016-12-16 11:07:42 +01:00
746d70b015 INFO: show num of slave-expires keys tracked. 2016-12-13 18:34:04 +01:00
c65dfb436e Replication: fix the infamous key leakage of writable slaves + EXPIRE.
BACKGROUND AND USE CASE

Redis slaves are normally write only, however they support a "writable"
mode which is very handy when scaling reads on slaves that actually
need write operations in order to access data. For instance imagine
having slaves replicating certain Sets keys from the master. When
accessing the data on the slave, we want to perform intersections between
such Sets values. However we don't want to compute the intersection each
time: caching it for some time is often a good idea.

To do so, it is possible to set up a slave as a writable slave, and
perform the intersection on the slave side, perhaps setting a TTL on the
resulting key so that it will expire after some time.

THE BUG

Problem: in order to have consistent replication, expiring keys in
Redis replication is up to the master, which synthesizes DEL operations to
send in the replication stream. However slaves logically expire keys
by hiding them from read attempts by clients, so that if the master did
not promptly send a DEL, the client still sees logically expired keys
as non existing.

Because slaves don't actively expire keys by actually evicting them but
just mask them from the POV of read operations, if a key is created in a
writable slave, and an expire is set, the key will be leaked forever:

1. No DEL will be received from the master, which does not know about
such a key at all.

2. No eviction will be performed by the slave, since eviction must be
disabled there because expiration is up to the master; otherwise
consistency of the data is lost.

THE FIX

In order to fix the problem, the slave should be able to tag keys that
were created on the slave side and have an expire set, in some way.

My solution involved using a unique additional dictionary, created by
the writable slave only if needed. The dictionary is obviously keyed by
the key name that we need to track: all the keys that are set with an
expire directly by a client writing to the slave are tracked.

The value in the dictionary is a bitmap of all the DBs where such a key
name needs to be tracked, so that we can use a single dictionary to track
keys in all the DBs used by the slave (this actually limits the solution
to the first 64 DBs, but the default with Redis is to use 16 DBs).
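
A self-contained sketch of the per-key DB bitmap (illustrative names; in
the real code the dictionary value plays this role):

    #include <stdint.h>

    typedef struct { uint64_t dbs; } tracked_key; /* dict value, per key */

    /* Mark dbid as tracking this key name; this is where the
     * first-64-DBs limit mentioned above comes from. */
    static void track_db(tracked_key *t, int dbid) {
        if (dbid < 64) t->dbs |= (uint64_t)1 << dbid;
    }

    static int is_tracked_in(const tracked_key *t, int dbid) {
        return dbid < 64 && ((t->dbs >> dbid) & 1);
    }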

This solution pays only a small complexity and CPU penalty, which is
actually zero when the feature is not used. The slave-side eviction is
encapsulated in code which is not coupled with the rest of the Redis
core, except for the hook to track the keys.

TODO

I'm doing the first smoke tests to see if the feature works as expected:
so far so good. Unit tests should be added before merging into the
4.0 branch.
2016-12-13 18:34:04 +01:00
5b7d42fff3 PSYNC2: bugfixing pre release.
1. The master replication offset was cleared after switching configuration
to some other slave, since it was assumed you can't PSYNC after a
switch. This is not the case anymore, and when we successfully PSYNC we
need to have our offset untouched.

2. Secondary replication ID was not reset to "000..." pattern at
startup.

3. A master in an error state, replying -LOADING or other transient
errors, forced the slave to discard the cached master and perform a full
resync. This is now fixed.

4. Better logging of what's happening on failed PSYNCs.
2016-11-23 17:36:45 +01:00
28c96d73b2 PSYNC2: Save replication ID/offset on RDB file.
This means that stopping a slave and restarting it will still make it
able to PSYNC with the master. Moreover the master itself will retain
its ID/offset, in case it gets turned into a slave, or in case a slave
tries to PSYNC with it with an exactly up-to-date offset (otherwise there is
no backlog).

This change was possible thanks to PSYNC v2 that makes saving the current
replication state much simpler.
2016-11-10 12:35:29 +01:00