redis

mirror of https://github.com/fluencelabs/redis synced 2025-05-01 21:42:13 +00:00

Author	SHA1	Message	Date
antirez	8c551d65a1	Sentinel: make sure role_reported is always updated.	2013-11-21 15:20:15 +01:00
antirez	9577fed8a3	Sentinel: track role change time. Wait before reconfigurations.	2013-11-21 15:20:11 +01:00
antirez	fc10fb17da	Sentinel: fix no-down check in master->slave conversion code.	2013-11-21 15:20:07 +01:00
antirez	b02ef3d59a	Sentinel: readd slaves back after a master reset.	2013-11-21 15:19:56 +01:00
antirez	1a1dc3de38	Sentinel: sentinelResetMaster() new flag to avoid removing set of sentinels. This commit also removes some dead code and cleanup generic flags.	2013-11-21 15:19:52 +01:00
antirez	9f0e52a13d	Sentinel: receive Pub/Sub messages from slaves.	2013-11-21 15:19:49 +01:00
antirez	f7604b4c07	Sentinel: change event name when converting master to slave.	2013-11-21 15:19:45 +01:00
antirez	1569326277	Sentinel: added config-epoch to SENTINEL masters output.	2013-11-21 15:19:41 +01:00
antirez	9f20780de6	Sentinel: new failover algo, desync slaves and update config epoch.	2013-11-21 15:19:34 +01:00
antirez	2488257ad8	Sentinel: when starting failover seek for votes ASAP.	2013-11-21 15:19:30 +01:00
antirez	6593f33222	Sentinel: +new-epoch events.	2013-11-21 15:19:26 +01:00
antirez	48acc675dd	Sentinel: wait some time between failover attempts.	2013-11-21 15:19:22 +01:00
antirez	447c2787a0	Sentinel: allow to vote for myself.	2013-11-21 15:19:16 +01:00
antirez	eba4775b5d	Sentinel: fix PUBLISH to masters and slaves.	2013-11-21 15:19:11 +01:00
antirez	b95c6ed7b7	Sentinel: epoch introduced in leader vote.	2013-11-21 15:18:52 +01:00
antirez	663d79c0d5	Sentinel: leadership handling changes WIP. Changes to leadership handling. Now the leader gets selected by every Sentinel, for a specified epoch, when the SENTINEL is-master-down-by-addr is sent. This command now includes the runid and the currentEpoch of the instance seeking for a vote. The Sentinel only votes a single time in a given epoch. Still a work in progress, does not even compile at this stage.	2013-11-21 15:18:45 +01:00
antirez	b72985d7f7	Sentinel: handle Hello messages received via slaves correctly. Even when messages are received via the slave, we should perform operations (like adding a new Sentinel) in the context of the master.	2013-11-21 15:18:41 +01:00
antirez	fe7f96f18c	Sentinel: remove code not useful in the new design.	2013-11-21 15:18:36 +01:00
antirez	be2ef1b59f	Sentinel: epoch introduced. Sentinel state now includes the idea of current epoch and config epoch. In the Hello message, that is now published both on masters and slaves, a Sentinel no longer just advertises itself but also broadcasts its current view of the configuration: the master name / ip / port and its current epoch. Sentinels receiving such information switch to the new master if the configuration epoch received is newer and the ip / port of the master are indeed different compared to the previous ones.	2013-11-21 15:18:31 +01:00
antirez	c874e39c45	Sentinel: sentinelSendSlaveOf() was missing a var and the prototype.	2013-11-06 11:29:57 +01:00
antirez	1767998751	Sentinel: increment pending_commands counter in two more places. AUTH and SCRIPT KILL were sent without incrementing the pending commands counter. Clearly this needs some kind of wrapper doing it for the caller in order to be less bug prone.	2013-11-06 11:29:53 +01:00
antirez	97810c45e8	Sentinel: always send CONFIG REWRITE when changing instance role. This change makes Sentinel less fragile about a number of failure modes. This commit also fixes a different bug as a side effect, SLAVEOF command was sent multiple times without incrementing the pending commands count.	2013-11-06 11:29:49 +01:00
antirez	f899ab55ca	sdsrange() does not need to return a value. Actually the string is modified in-place and a reallocation is never needed, so there is no need to return the new sds string pointer as return value of the function, that is now just "void".	2013-07-24 11:22:52 +02:00
antirez	1e23848ed3	Sentinel: embed IPv6 address into [] when naming slave/sentinel instance.	2013-07-11 17:10:09 +02:00
antirez	076f6395b9	Sentinel: use comma as separator to publish hello messages. We use comma to play well with IPv6 addresses, but the implementation is still able to parse the old messages separated by colons.	2013-07-11 17:09:44 +02:00
antirez	98d0abcecd	Sentinel: make sure published addr/id buffer is large enough. With ipv6 support we need more space, so we account for the IP address max size plus what we need for the Run ID, port, flags.	2013-07-11 17:09:30 +02:00
antirez	a7451c1b6d	All IP string repr buffers are now REDIS_IP_STR_LEN bytes.	2013-07-11 17:07:52 +02:00
Geoff Garside	0d8f254359	Add IPv6 support to sentinel.c. This has been done by exposing the anetSockName() function in anet.c to be used when the sentinel is publishing its existence to the masters. This implementation is very unintelligent as it will likely break if used with IPv6 as the nested colons will break any parsing of the PUBLISH string by the master.	2013-07-11 17:07:31 +02:00
Geoff Garside	4b2e374e4a	Update calls to anetResolve to include buffer size	2013-07-11 17:05:08 +02:00
antirez	9af8125c7d	Sentinel: parse new INFO replication output correctly. Sentinel was not able to detect slaves when connected to a very recent version of Redis master since a previous non-backward compatible change to INFO broke the parsing of the slaves ip:port INFO output. This fixes issue #1164	2013-06-20 10:24:31 +02:00
antirez	e7bcec829c	Sentinel: changes to tilt mode. Tilt mode was too aggressive (not processing INFO output), this resulted in a few problems: 1) Redirections were not followed when in tilt mode. This opened a window to misinform clients about the current master when a Sentinel was in tilt mode and a fail over happened during the time it was not able to update the state. 2) It was possible for a Sentinel exiting tilt mode to detect a false fail over start, if a slave rebooted with a wrong configuration at about the same time. This used to happen since in tilt mode we lose the information that the runid changed (reboot). Now instead the Sentinel in tilt mode will still remove the instance from the list of slaves if it changes state AND runid at the same time. Both are edge conditions but the changes should overall improve the reliability of Sentinel.	2013-04-30 15:09:14 +02:00
antirez	4028a777b6	Sentinel: more sensible delay in master demote after tilt.	2013-04-30 15:09:10 +02:00
antirez	70845320cc	Sentinel: only demote old master into slave under certain conditions. We used to always turn a master into a slave if the DEMOTE flag was set, as this was a resurrecting master instance. However the following race condition is possible for a Sentinel that got partitioned or had internal issues (tilt mode), and was not able to refresh the state in the meantime: 1) Sentinel X is running, master is instance "A". 2) "A" fails, sentinels will promote slave "B" as master. 3) Sentinel X goes down because of a network partition. 4) "A" returns available, Sentinels will demote it as a slave. 5) "B" fails, other Sentinels will promote slave "A" as master. 6) At this point Sentinel X comes back. When "X" comes back he thinks that: "B" is the master. "A" is the slave to demote. We want to avoid that Sentinel "X" will demote "A" into a slave. We also want that Sentinel "X" will detect that the conditions changed and will reconfigure itself to monitor the right master. There are two main ways for the Sentinel to reconfigure itself after this event: 1) If "B" is reachable and already configured as a slave by other sentinels, "X" will perform a redirection to "A". 2) If there are not the conditions to demote "A", the fact that "A" reports to be a master will trigger a failover detection in "X", that will end into a reconfiguration to monitor "A". However if the Sentinel was not reachable, its state may not be updated, so in case it tilted, or was partitioned from the master instance of the slave to demote, the new implementation waits some time (enough to guarantee we can detect the new INFO, and new DOWN conditions). If after some time still there are not the right conditions to demote the instance, the DEMOTE flag is cleared.	2013-04-30 15:09:06 +02:00
antirez	d2ff5ed603	Sentinel: always redirect on master->slave transition. Sentinel redirected to the master if the instance changed runid or it was the first time we got INFO, and a role change was detected from master to slave. While this is a good idea in case of slave->master, since otherwise we could detect a failover without good reasons just after a reboot with a slave with a wrong configuration, in the case of master->slave transition it is much better to always perform the redirection for the following reasons: 1) A Sentinel may go down for some time. When it is back online there is no other way to understand there was a failover. 2) Pointing clients to a slave seems to be always the wrong thing to do. 3) There is no good rationale about handling things differently once an instance is rebooted (runid change) in that case.	2013-04-24 11:34:02 +02:00
antirez	d0c9a2a767	Sentinel: turn old master into a slave when it comes back.	2013-04-22 11:26:29 +02:00
antirez	fcfdbda104	Sentinel: advertise the promoted slave address only after successful setup.	2013-02-11 11:44:14 +01:00
guiquanz	1caf09399e	Fixed many typos. Conflicts fixed, mainly because 2.8 has no cluster support / files: 00-RELEASENOTES src/cluster.c src/crc16.c src/redis-trib.rb src/redis.h	2013-01-19 11:03:19 +01:00
antirez	8ddb23b90c	BSD license added to every C source and header file.	2012-11-08 18:34:04 +01:00
antirez	dfb7194cba	Sentinel: Support for AUTH.	2012-09-27 13:06:17 +02:00
antirez	b8ce9a84c5	Sentinel: reply -IDONTKNOW to get-master-addr-by-name on lack of info. If we don't have any clue about a master since it never replied to INFO so far, reply with an -IDONTKNOW error to SENTINEL get-master-addr-by-name requests.	2012-09-27 13:06:12 +02:00
antirez	1f8bd82332	Sentinel: more easy master redirection if master is a slave. Before this commit Sentinel used to redirect master ip/addr if the current instance reported to be a slave only if this was the first INFO output received, and the role was found to be slave. Now instead also if we find that the runid is different, and the reported role is slave, we also redirect to the reported master ip/addr. This unifies the behavior of Sentinel in the case of a reboot (where it will see the first INFO output with the wrong role and will perform the redirection), with the behavior of Sentinel in the case of a change in what it sees in the INFO output of the master.	2012-09-27 13:06:05 +02:00
antirez	ef792fc950	Sentinel: do not crash against slaves not publishing the runid. Older versions of Redis (before 2.4.17) don't publish the runid field in INFO. This commit makes Sentinel able to handle that without crashing.	2012-09-27 13:06:01 +02:00
antirez	de499f7f7e	Sentinel: INFO command implementation.	2012-09-27 13:05:58 +02:00
antirez	161e137c55	Sentinel: Sentinel-side support for slave priority. The slave priority that is now published by Redis in INFO output is now used by Sentinel in order to select the slave with minimum priority for promotion, and in order to consider slaves with priority set to 0 as not able to play the role of master (they will never be promoted by Sentinel). The "slave-priority" field is now one of the fields that Sentinel publishes when describing an instance via the SENTINEL commands such as "SENTINEL slaves mastername".	2012-09-27 13:05:49 +02:00
antirez	d480b9ce7f	Sentinel: suppress harmless warning by initializing 'table' to NULL. Note that the assertion guarantees that one of the if branches setting table is always entered.	2012-09-27 13:05:45 +02:00
antirez	fa23fc3363	Sentinel: send SCRIPT KILL on -BUSY reply and SDOWN instance. From the point of view of Redis an instance replying -BUSY is down, since it is effectively not able to reply to user requests. However a looping script is a recoverable condition in Redis if the script still did not perform any write to the dataset. In that case performing a fail over is not optimal, so Sentinel now tries to restore the normal server condition killing the script with a SCRIPT KILL command. If the script already performed some write before entering an infinite (or long enough to timeout) loop, SCRIPT KILL will not work and the fail over will be triggered anyway.	2012-09-27 13:05:41 +02:00
antirez	fc0a0d4aa7	Sentinel: fixed a crash on script execution. The call to sentinelScheduleScriptExecution() lacked the final NULL argument to signal the end of arguments. This resulted into a crash.	2012-09-27 13:05:38 +02:00
antirez	ea9bec50c6	Sentinel: SENTINEL FAILOVER command implemented. This command can be used in order to force a Sentinel instance to start a failover for the specified master, as leader, forcing the failover even if the master is up. The commit also adds some minor refactoring and other improvements to functions already implemented that make them able to work when the master is not in SDOWN condition. For instance slave selection assumed that we ask INFO every second to every slave, this is true only when the master is in SDOWN condition, so slave selection did not work when the master was not in SDOWN condition.	2012-09-27 13:05:33 +02:00
antirez	26a340095d	Sentinel: client reconfiguration script execution. This commit adds support to optionally execute a script when one of the following events happens: * The failover starts (with a slave already promoted). * The failover ends. * The failover is aborted. The script is called with enough parameters (documented in the example sentinel.conf file) to provide information about the old and new ip:port pair of the master, the role of the sentinel (leader or observer) and the name of the master. The goal of the script is to inform clients of the configuration change in a way specific to the environment Sentinel is running, that can't be implemented in a general way inside Sentinel itself.	2012-09-27 13:05:30 +02:00
antirez	524b79d231	Sentinel: when leader in wait-start, sense another leader as race. When we are in wait start, if another leader (or any other external entity) turns a slave into a master, abort the failover, and detect it as an observer. Note that the wait-start state is mainly there for this reason but the abort was not yet implemented. This adds a new sentinel event -failover-abort-race.	2012-09-27 13:05:26 +02:00


64 Commits