redis

mirror of https://github.com/fluencelabs/redis synced 2025-06-19 12:11:21 +00:00

Author	SHA1	Message	Date
Matt Stancliff	80dec5e4df	Sentinel: Notify user when config can't be saved	2014-03-25 08:03:07 +01:00
Jan-Erik Rediger	a2ec9a9068	Small typo fixed	2014-03-25 08:02:56 +01:00
Matt Stancliff	88c6c669c3	Fix data loss when saving AOF/RDB with no free space. Previously, the (!fp) check would only catch lack of free space under OS X. Linux waits to discover it can't write until it actually writes contents to disk. (fwrite() returns success even if the underlying file has no free space to write into. All the errors only show up at flush/sync/close time.) Fixes antirez/redis#1604	2014-03-24 21:56:22 +01:00
Jan-Erik Rediger	6182dad37c	Finally fix the `install_server.sh` script. Includes changes from a dozen bug reports and pull requests. Was tested on Ubuntu, Debian and CentOS.	2014-03-24 18:12:58 +01:00
antirez	4ebc7e3738	Sample and cache RSS in serverCron(). Obtaining the RSS (Resident Set Size) info is slow in Linux and OSX. This slowed down the generation of the INFO 'memory' section. Since the RSS does not need to be a real-time measurement, we now sample it with server.hz frequency (10 times per second by default) and use this value both to show the INFO rss field and to compute the fragmentation ratio. Practically this does not make any difference for memory profiling of Redis but speeds up the INFO call significantly.	2014-03-24 12:03:44 +01:00
antirez	72257f4b87	sdscatvprintf(): Try to use a static buffer. For small content the function now tries to use a static buffer to avoid a malloc/free cycle that is too costly when the function is used in the context of performance critical code path such as INFO output generation. This change was verified to have positive effects in the execution speed of the INFO command.	2014-03-24 10:23:58 +01:00
antirez	7054f2a029	Cache uname() output across INFO calls. Uname was profiled to be a slow syscall. It always produces the same output in the context of a single execution of Redis, so calling it at every INFO output generation does not make much sense. The uname utsname structure was changed to a static variable. At the same time a static integer was added to check if we need to call uname the first time.	2014-03-24 10:02:58 +01:00
antirez	7a20f09647	sdscatvprintf(): guess buflen using format length. sdscatvprintf() uses a loop where it tries to output the formatted string in a buffer of the initial length; if there was not enough room, a buffer of doubled size is tried and so forth. The initial guess for the buffer length was very poor, a hardcoded "16". This caused the printf to be processed multiple times without a good reason. Given that printf functions are already not fast, the overhead was significant. The new heuristic is to use a buffer 4 times the length of the format buffer, and 32 as minimal size. This appears to be a good balance for typical uses of the function inside the Redis code base. This change made the INFO command 3 times faster.	2014-03-24 09:45:18 +01:00
antirez	001775f5fb	Use 24 bits for the lru object field and improve resolution. There were 2 spare bits inside the Redis object structure that are now used in order to enlarge 4x the range of the LRU field. At the same time the resolution was improved from 10 to 1 second: this still provides 194 days before the LRU counter overflows (restarting from zero). This is not a problem since it only causes lack of eviction precision for objects not touched for a very long time, and the lack of precision is only temporary.	2014-03-21 11:19:28 +01:00
antirez	c68189a19f	Specify lruclock in redisServer structure via REDIS_LRU_BITS. The padding field was totally useless: removed.	2014-03-21 11:17:52 +01:00
antirez	ff8c81873a	Set LRU parameters via REDIS_LRU_BITS define.	2014-03-21 11:17:36 +01:00
antirez	e3b71a1c02	Unify stats reset for CONFIG RESETSTAT / initServer(). Now CONFIG RESETSTAT makes sure to reset all the fields, and in the future it will be simpler to avoid missing new fields.	2014-03-21 11:16:12 +01:00
antirez	0937377aa2	Sentinel: sentinelRefreshInstanceInfo() minor refactoring. Test sentinel.tilt condition on top and return if it is true. This allows to remove the check for the tilt condition in the remaining code paths of the function.	2014-03-21 11:16:12 +01:00
antirez	686839b477	Sentinel test: 02 unit better coverage + refactoring.	2014-03-21 11:16:12 +01:00
antirez	6d0e408a27	Sentinel test: foreach_instance_id implements 'break'.	2014-03-21 11:16:12 +01:00
antirez	ba2edc4191	Sentinel: instance_is_killed proc added to sentinel.tcl.	2014-03-21 11:16:12 +01:00
antirez	9c2063fb8f	Sentinel: propagate down-after-ms changes to slaves and sentinels.	2014-03-21 11:16:12 +01:00
antirez	ffa8f479a5	Sentinel: down-after-milliseconds is not master-specific. addReplySentinelRedisInstance() modified so that this field is displayed for all the kind of instances: Sentinels, Masters, Slaves.	2014-03-21 11:16:12 +01:00
antirez	42091a79bb	Sentinel failure detection implementation improved. Failure detection in Sentinel is ping-pong based. It used to work by remembering the last time a valid PONG reply was received, and checking if the reception time was too old compared to the current time. PINGs were sent at a fixed interval of 1 second. This works in a decent way, but does not scale well when we want to set very small values of "down-after-milliseconds" (this is the node timeout basically). This commit reimplements the failure detection making a number of changes. Some changes are inspired by Redis Cluster failure detection code: * A new last_ping_time field is added in the representation of instances. If non zero, we have an active ping that was sent at the specified time. When a valid reply to ping is received, the field is zeroed again. * last_ping_time is not reset when we reconnect the link or send a new ping, so from our point of view it represents the time we started waiting for the instance to reply to our pings without receiving a reply. * last_ping_time is now used in order to check if the instance is timed out. This means that we can have a node timeout of 100 milliseconds and yet the system will work well since the new check is not bound to the period used to send pings. * Pings are now sent every second, or more often if the value of down-after-milliseconds is less than one second, with a lower limit of 10 HZ ping frequency. * Link reconnection code was improved. This is used in order to try to reconnect the link when we are at 50% of the node timeout without a valid reply received yet. However the old code triggered unnecessary reconnections when the node timeout was very small. Now that should be ok. The new code passes the tests but more testing is needed and more unit tests stressing the failure detector, so currently this is merged only in the unstable branch.	2014-03-21 11:16:11 +01:00
antirez	38241c4b1e	Sentinel: use CLIENT SETNAME when connecting to Redis. This makes debugging / monitoring of Sentinels simpler since you can identify sentinels in CLIENT LIST output of Redis instances.	2014-03-21 11:16:11 +01:00
Matt Stancliff	9de0755869	Fix segfault from accessing array out of bounds. argc == 2; argv[2] == crash	2014-03-14 22:57:10 +01:00
antirez	a31a0b4336	Sentinel: be safe under crash-recovery assumptions. Sentinel's main safety argument is that there are no two configurations for the same master with the same version (configuration epoch). For this to be true Sentinels are required to be authorized by a majority. Additionally Sentinels are required to do two important things: * Never vote again for the same epoch. * Never exchange an old vote for a fresh one. The first prerequisite, in a crash-recovery system model, requires persisting the master->leader_epoch on durable storage before replying to messages. This was not the case. We also make sure to persist the current epoch in order to never reply to stale vote requests from other Sentinels, after a recovery. The configuration is persisted by making use of fsync(); this is considered in the context of this code a good enough guarantee that after a restart our durable state is restored, however this may not always be the case depending on the kind of hardware and operating system used.	2014-03-14 22:57:06 +01:00
antirez	6b0e36ffc6	Sentinel: fake PUBLISH command to receive HELLO messages. Now the way HELLO messages are received is unified. Now it is no longer needed for Sentinels to converge to the higher configuration for a master to be able to chat via some Redis instance; they are able to directly exchange configurations. Note that this commit does not include the (trivial) change needed to send HELLO messages to Sentinel instances as well, since by mistake I committed the change in the previous commit that refactored hello messages processing into a separate function.	2014-03-14 11:04:54 +01:00
antirez	bd48ff69b0	Sentinel: HELLO processing refactored into sentinelProcessHelloMessage().	2014-03-14 10:56:56 +01:00
antirez	3703112671	Linenoise updated, multiline mode enabled in redis-cli.	2014-03-13 15:12:04 +01:00
zhanghailei	ccd2c18c1c	According to context, the size should be 16 rather than 64	2014-03-11 10:12:19 +01:00
zhanghailei	daa5d2a6c9	Fixed a typo: "more thank" should be "more than"	2014-03-11 10:12:13 +01:00
zhanghailei	ca0720b694	Refer to updateLRUClock's comment: REDIS_LRU_CLOCK_MAX is 22 bits, but #define REDIS_LRU_CLOCK_MAX ((1<<21)-1) is only 21 bits	2014-03-11 10:12:00 +01:00
Matt Stancliff	ffe742f92a	Reset op_sec_last_sample_ops when reset requested. This value needs to be set to zero (in addition to stat_numcommands) or else people may see a negative operations per second count after they run CONFIG RESETSTAT. Fixes antirez/redis#1577	2014-03-11 10:10:54 +01:00
antirez	6fb9e2d717	Typo in sentinel.conf, exists -> exits.	2014-03-11 10:10:30 +01:00
antirez	eb9e1526e6	DEBUG ERROR implemented. The new "error" subcommand of the DEBUG command can reply with a user-selected error, specified as its sole argument: DEBUG ERROR "LOADING please wait..." The error is generated just by prefixing the command argument with a "-" character, and replacing newlines with spaces (since error replies can't include newlines). The goal of the command is to help in client library unit tests by making it simple to simulate a command call triggering a given error.	2014-03-10 23:04:37 +01:00
Matt Stancliff	a6970570e3	Fix return value check for anetTcpAccept. anetTcpAccept returns ANET_ERR, not AE_ERR. This isn't a physical error since both ANET_ERR and AE_ERR are -1, but better to be consistent.	2014-03-10 15:47:20 +01:00
antirez	fe0ab7d234	Fixed memory leak in SORT LIMIT option argument parsing on error.	2014-03-10 15:45:29 +01:00
antirez	464fef9bf8	Redis 2.8.7. 2.8.7	2014-03-05 14:42:50 +01:00
antirez	aab16ead92	Cast saveparams[].seconds to long for %ld format specifier.	2014-03-05 11:26:50 +01:00
antirez	86afc0706d	Merge branch '2.8' of github.com:/antirez/redis into 2.8	2014-03-05 10:22:56 +01:00
antirez	55b8f6ec1c	Document why we update peak memory in INFO.	2014-03-05 10:16:20 +01:00
Matt Stancliff	647a261465	Force INFO used_memory_peak to match peak memory. used_memory_peak only updates in serverCron every server.hz, but Redis can use more memory and a user can request memory INFO before used_memory_peak gets updated in the next cron run. This patch updates used_memory_peak to the current memory usage if the current memory usage is higher than the recorded used_memory_peak value. (And it only calls zmalloc_used_memory() once instead of twice as it was doing before.)	2014-03-05 10:16:16 +01:00
antirez	9c66cd91be	Sentinel test: Makefile target added.	2014-03-05 10:16:12 +01:00
michael-grunder	6991792943	Improved bigkeys with progress, pipelining and summary. This commit reworks the redis-cli --bigkeys command to provide more information about our progress as well as output summary information when we're done. - We now show an approximate percentage completion as we go - Hiredis pipelining is used for TYPE and SIZE retrieval - A summary of keyspace distribution and overall breakout at the end	2014-03-05 10:16:06 +01:00
antirez	7d65b7199a	BITPOS fuzzy testing.	2014-03-05 10:16:02 +01:00
antirez	c19cfde65d	Basic BITPOS tests.	2014-03-05 10:15:55 +01:00
antirez	55f4b20f31	Sentinel test: set less time sensitive defaults. This commit sets the failover timeout to 30 seconds instead of the 180 seconds default, and allows reconfiguring multiple slaves at the same time. This makes tests less sensitive to timing, with the result that there are fewer false positives due to normal behaviors that require time to succeed or to be retried. However the long term solution is probably some way to detect when a test failed because of timing issues (for example split brain during leader election) and retry it.	2014-03-05 10:15:32 +01:00
antirez	1606978a0b	Sentinel: more aggressive failover start desynchronization. Sentinel needs to avoid split brain conditions due to multiple sentinels trying to get voted at the exact same time. So far some desynchronization was provided by fluctuating server.hz, that is the frequency of the timer function call. However the desynchronization provided in this way was not enough when using many Sentinel instances, especially when a large quorum value is used in order to force a greater degree of agreement (more than N/2+1). It was verified that it was likely to trigger a split brain condition, forcing the system to try again after a timeout. Usually the system will succeed after a few retries, but this is not optimal. This commit desynchronizes instances in a more effective way to make it likely that the first attempt will be successful.	2014-03-05 10:15:32 +01:00
antirez	6200191943	CONFIG REWRITE should be logged at WARNING level.	2014-03-05 10:15:32 +01:00
antirez	2af2173a60	Sentinel test: debugging console improved.	2014-03-05 10:15:32 +01:00
antirez	1dc1e31c2b	Sentinel test: initial debugging console.	2014-03-05 10:15:32 +01:00
antirez	b132121a71	Sentinel test: be more patient in create_redis_master_slave_cluster.	2014-03-05 10:15:32 +01:00
antirez	d1d706c923	Sentinel test: add test start time in output.	2014-03-05 10:15:32 +01:00
antirez	c99cd2fd40	Sentinel test: use 1000 as retry in initial 00 unit test.	2014-03-05 10:15:32 +01:00
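
The arithmetic behind commit 001775f5fb (24-bit LRU field, 1 second resolution, "194 days" before overflow) can be checked with a short sketch. This is a Python model, not Redis source: the names `LRU_BITS`, `LRU_CLOCK_MAX`, `LRU_CLOCK_RESOLUTION`, `lru_clock`, `days_until_wrap` and `idle_seconds` are hypothetical, chosen only to mirror the wording of the commit message, and the wraparound handling is plain modular subtraction rather than whatever Redis does internally.

```python
# Hypothetical constants modeled on the commit message, not copied from Redis.
LRU_BITS = 24                        # width of the LRU field after the change
LRU_CLOCK_MAX = (1 << LRU_BITS) - 1  # largest value the field can hold
LRU_CLOCK_RESOLUTION = 1             # seconds per clock tick


def lru_clock(unix_seconds: int) -> int:
    """Truncate a timestamp to the LRU field width (wraps to 0 on overflow)."""
    return (unix_seconds // LRU_CLOCK_RESOLUTION) & LRU_CLOCK_MAX


def days_until_wrap(bits: int, resolution_seconds: int) -> float:
    """Days a counter of the given width and resolution runs before wrapping."""
    return (1 << bits) * resolution_seconds / 86400


def idle_seconds(now_clock: int, obj_clock: int) -> int:
    """Idle time of an object, tolerating a single wrap of the clock."""
    return ((now_clock - obj_clock) & LRU_CLOCK_MAX) * LRU_CLOCK_RESOLUTION


if __name__ == "__main__":
    print(f"{days_until_wrap(LRU_BITS, LRU_CLOCK_RESOLUTION):.1f} days")
```

An object untouched for longer than one full wrap would look younger than it is, which is the temporary loss of eviction precision the commit message describes.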


3808 Commits
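
The failure detection scheme described in commit 42091a79bb can be sketched as a small state machine. This is a Python illustration under stated assumptions, not the Sentinel implementation: the `Instance` class and all function names are hypothetical, timestamps are assumed to be nonzero milliseconds, and the ping period clamp (between 100 ms and 1000 ms) is one reading of "every second, or more often ... with a lower limit of 10 HZ".

```python
from dataclasses import dataclass


@dataclass
class Instance:
    # Time (ms) the oldest unanswered ping was sent; 0 means no ping pending.
    last_ping_time: int = 0
    # Time (ms) of the last valid reply.
    last_pong_time: int = 0


def ping_period_ms(down_after_ms: int) -> int:
    """Assumed reading: ping every down_after_ms, clamped to [100, 1000] ms."""
    return max(100, min(1000, down_after_ms))


def send_ping(inst: Instance, now_ms: int) -> None:
    """Record the send time only if no ping is already waiting for a reply."""
    if inst.last_ping_time == 0:
        inst.last_ping_time = now_ms


def on_valid_reply(inst: Instance, now_ms: int) -> None:
    """A valid reply clears the pending ping."""
    inst.last_pong_time = now_ms
    inst.last_ping_time = 0


def is_timed_out(inst: Instance, now_ms: int, down_after_ms: int) -> bool:
    """Down if we have been waiting for a reply longer than the timeout."""
    return (inst.last_ping_time != 0
            and now_ms - inst.last_ping_time > down_after_ms)


def should_reconnect(inst: Instance, now_ms: int, down_after_ms: int) -> bool:
    """Try a fresh link once half the timeout has passed without a reply."""
    return (inst.last_ping_time != 0
            and now_ms - inst.last_ping_time > down_after_ms / 2)
```

Because `send_ping` leaves an existing `last_ping_time` untouched, the timeout is measured from the first unanswered ping, so it does not depend on how often pings are sent.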