3376 Commits

Author SHA1 Message Date
fc93198ff9 Sentinel: various fixes to leader election implementation. 2013-11-21 15:22:52 +01:00
44b3684633 Sentinel: failover script execution fixed. 2013-11-21 15:22:48 +01:00
6d0400f569 Sentinel: no longer used defines removed. 2013-11-21 15:22:44 +01:00
1bbce5bf01 Sentinel: when writing config on disk, remember sentinels runid. 2013-11-21 15:22:27 +01:00
679dc8b09d Sentinel: arity of known-sentinel/slave is 4 not 3. 2013-11-21 15:22:21 +01:00
3e1fc6278d Sentinel: rewriteConfigSentinelOption() sub-iterators var typo fixed. 2013-11-21 15:22:18 +01:00
9a9c0cfaa6 Sentinel: call sentinelFlushConfig() to persist state when needed.
Also the sentinel configuration rewriting was modified in order to
account for failover in progress, where we need to provide the promoted
slave address as master address, and the old master address as one of
the slaves address.
2013-11-21 15:22:14 +01:00
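
The address swap described in 9a9c0cfaa6 can be illustrated with a minimal sketch. It is not the Sentinel implementation (which is written in C); the function name `addresses_to_persist` and its parameters are hypothetical, and it only assumes what the message states: while a failover is in progress, the promoted slave is written as the master and the old master as one of the slaves.

```python
from typing import List, Optional, Tuple

Addr = Tuple[str, int]  # (ip, port)


def addresses_to_persist(
    master: Addr, slaves: List[Addr], promoted: Optional[Addr]
) -> Tuple[Addr, List[Addr]]:
    """Return the (master, slaves) addresses to write to the config file.

    `promoted` is the slave being promoted, or None when no failover
    is in progress. Hypothetical helper, for illustration only.
    """
    if promoted is None:
        # No failover in progress: persist the state unchanged.
        return master, list(slaves)
    # Failover in progress: the promoted slave is persisted as the master,
    # and the old master is listed among the slaves.
    remaining = [s for s in slaves if s != promoted]
    return promoted, remaining + [master]
```

Persisting the swapped view means a Sentinel restarted in the middle of a failover would read back the new master address rather than the old one.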
93d924ff1c Sentinel: sentinelFlushConfig() to CONFIG REWRITE + fsync. 2013-11-21 15:22:11 +01:00
a52909c5f2 Sentinel: CONFIG REWRITE support for Sentinel config. 2013-11-21 15:22:07 +01:00
8c3e197040 Sentinel: can-failover option removed, many comments fixed. 2013-11-21 15:22:02 +01:00
04b1fb0b1a Fix typo 'configuraiton' in rewriteConfigRewriteLine() comment. 2013-11-21 15:21:58 +01:00
0b9853ecdc Sentinel: added config options useful to take state on config rewrite.
We'll use CONFIG REWRITE (internally) in order to store the new
configuration of a Sentinel after the internal state changes. In order
to do so, we need configuration options (that usually the user will not
touch at all) about config epoch of the master, Sentinels and Slaves
known for this master, and so forth.
2013-11-21 15:21:55 +01:00
737062745d Sentinel: failover abort function simplified. 2013-11-21 15:21:50 +01:00
66b03c1a40 Sentinel: slaves reconfig delay modified.
The time Sentinel waits since the slave is detected to be configured to
the wrong master, before reconfiguring it, is now the failover_timeout
time as this makes more sense in order to give the Sentinel performing
the failover enough time to reconfigure the slaves slowly (if required
by the configuration).

Also we now PUBLISH the new configuration more frequently, as this allows
switching the reappearing master back to slave faster.
2013-11-21 15:21:46 +01:00
8ba31c218b Sentinel: failover restart time is now multiple of failover timeout.
Also the default failover timeout changed to 3 minutes, as the failover is a
fairly fast procedure most of the time, unless there is a very large
number of slaves and the user chose to configure them sequentially (in
that case the user should change the failover timeout accordingly).
2013-11-21 15:21:41 +01:00
ccaba966bc Sentinel: state machine and timeouts simplified. 2013-11-21 15:21:37 +01:00
3c4497e83c Sentinel: election timeout define. 2013-11-21 15:21:34 +01:00
e15ba6a697 Sentinel: fix address of master in Hello messages.
Once we switched configuration during a failover, we should advertise
the new address.

This was a serious race condition as the Sentinel performing the
failover for a moment advertised the old address with the new
configuration epoch: once transmitted to the other Sentinels the broken
configuration would remain there forever, until the next failover
(because a greater configuration epoch is required to overwrite an older
one).
2013-11-21 15:21:30 +01:00
1a6abe7d79 Sentinel: master address selection in get-master-address refactored. 2013-11-21 15:21:26 +01:00
0eeb0a0782 Sentinel: fix conditional to only affect slaves with wrong master. 2013-11-21 15:21:21 +01:00
64c8de8657 Sentinel: simplify and refactor slave reconfig code. 2013-11-21 15:20:56 +01:00
782f9cacaf Sentinel: reconfigure slaves to right master. 2013-11-21 15:20:44 +01:00
7dbc0a63f5 Sentinel: remember last time slave changed master. 2013-11-21 15:20:40 +01:00
612dbb2a91 Sentinel: redirect-to-master is not ok with new algorithm.
Now Sentinel believes the current configuration is always the winner and
should be applied by Sentinels instead of trying to adapt our view of
the cluster based on what we observe.

So the only way to modify what a Sentinel believes to be the truth is to
win an election and advertise the new configuration via Pub / Sub with a
greater configuration epoch.
2013-11-21 15:20:36 +01:00
4ccf807abc Sentinel: safer slave reconfig, master reported role should match. 2013-11-21 15:20:31 +01:00
e98d82c639 Sentinel: role reporting fixed and added in SENTINEL output. 2013-11-21 15:20:25 +01:00
be19e5450c Sentinel: being a master and reporting as slave is considered SDOWN. 2013-11-21 15:20:20 +01:00
8c551d65a1 Sentinel: make sure role_reported is always updated. 2013-11-21 15:20:15 +01:00
9577fed8a3 Sentinel: track role change time. Wait before reconfigurations. 2013-11-21 15:20:11 +01:00
fc10fb17da Sentinel: fix no-down check in master->slave conversion code. 2013-11-21 15:20:07 +01:00
b02ef3d59a Sentinel: readd slaves back after a master reset. 2013-11-21 15:19:56 +01:00
1a1dc3de38 Sentinel: sentinelResetMaster() new flag to avoid removing set of sentinels.
This commit also removes some dead code and cleans up generic flags.
2013-11-21 15:19:52 +01:00
9f0e52a13d Sentinel: receive Pub/Sub messages from slaves. 2013-11-21 15:19:49 +01:00
f7604b4c07 Sentinel: change event name when converting master to slave. 2013-11-21 15:19:45 +01:00
1569326277 Sentinel: added config-epoch to SENTINEL masters output. 2013-11-21 15:19:41 +01:00
9f20780de6 Sentinel: new failover algo, desync slaves and update config epoch. 2013-11-21 15:19:34 +01:00
2488257ad8 Sentinel: when starting failover seek for votes ASAP. 2013-11-21 15:19:30 +01:00
6593f33222 Sentinel: +new-epoch events. 2013-11-21 15:19:26 +01:00
48acc675dd Sentinel: wait some time between failover attempts. 2013-11-21 15:19:22 +01:00
447c2787a0 Sentinel: allow to vote for myself. 2013-11-21 15:19:16 +01:00
eba4775b5d Sentinel: fix PUBLISH to masters and slaves. 2013-11-21 15:19:11 +01:00
b95c6ed7b7 Sentinel: epoch introduced in leader vote. 2013-11-21 15:18:52 +01:00
663d79c0d5 Sentinel: leadership handling changes WIP.
Changes to leadership handling.

Now the leader gets selected by every Sentinel, for a specified epoch,
when the SENTINEL is-master-down-by-addr is sent.

This command now includes the runid and the currentEpoch of the instance
seeking for a vote. The Sentinel only votes a single time in a given
epoch.

Still a work in progress, does not even compile at this stage.
2013-11-21 15:18:45 +01:00
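
The one-vote-per-epoch rule described in 663d79c0d5 can be sketched as follows. This is an illustration, not the code from the commit: the class and field names are hypothetical, and adopting the requester's epoch when it is newer than our own is an assumption not stated in the message.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SentinelVoteState:
    """Hypothetical per-master voting state kept by one Sentinel."""

    current_epoch: int = 0
    leader_epoch: int = 0               # epoch in which the last vote was cast
    leader_runid: Optional[str] = None  # runid that received that vote

    def vote(self, req_runid: str, req_epoch: int) -> Tuple[Optional[str], int]:
        """Handle a vote request carrying the requester's runid and epoch."""
        # Assumption: a newer epoch seen in a request is adopted locally.
        if req_epoch > self.current_epoch:
            self.current_epoch = req_epoch
        # Grant the vote only if no vote was cast in this epoch (or a later one).
        if self.leader_epoch < req_epoch and self.current_epoch <= req_epoch:
            self.leader_runid = req_runid
            self.leader_epoch = req_epoch
        # Reply with whoever holds our vote, and for which epoch.
        return self.leader_runid, self.leader_epoch
```

A second request in the same epoch gets back the runid that already holds the vote, so two candidates cannot both collect this Sentinel's vote for that epoch.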
b72985d7f7 Sentinel: handle Hello messages received via slaves correctly.
Even when messages are received via the slave, we should perform
operations (like adding a new Sentinel) in the context of the master.
2013-11-21 15:18:41 +01:00
fe7f96f18c Sentinel: remove code not useful in the new design. 2013-11-21 15:18:36 +01:00
be2ef1b59f Sentinel: epoch introduced.
Sentinel state now includes the idea of current epoch and config epoch.
In the Hello message, that is now published both on masters and slaves,
a Sentinel no longer just advertises itself but also broadcasts its
current view of the configuration: the master name / ip / port and its
current epoch.

Sentinels receiving such information switch to the new master if the
configuration epoch received is newer and the ip / port of the master
are indeed different compared to the previous ones.
2013-11-21 15:18:31 +01:00
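
The switching rule described in be2ef1b59f can be sketched as below. The names `MasterView` and `apply_hello` are hypothetical and the real logic lives in Sentinel's C source. The sketch encodes only the two conditions in the message: the received configuration epoch is newer, and the advertised ip / port differ from the current ones. Leaving the local state untouched in every other case is an assumption.

```python
from dataclasses import dataclass


@dataclass
class MasterView:
    """Hypothetical view of one monitored master, as carried in a Hello message."""

    name: str
    ip: str
    port: int
    config_epoch: int


def apply_hello(local: MasterView, hello: MasterView) -> bool:
    """Update `local` from a received Hello; return True if the master switched."""
    if hello.name != local.name:
        return False  # Hello refers to a different master
    if hello.config_epoch <= local.config_epoch:
        return False  # not newer: the local configuration wins
    if (hello.ip, hello.port) == (local.ip, local.port):
        return False  # same address: nothing to switch to
    # Newer epoch and a different address: adopt the advertised master.
    local.ip, local.port = hello.ip, hello.port
    local.config_epoch = hello.config_epoch
    return True
```

Because a configuration is only replaced by one with a greater epoch, an address advertised with the wrong epoch would persist until the next failover, which is the failure mode that e15ba6a697 fixes.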
98682cd178 Log which master a slave is going to connect to. 2013-11-11 09:25:41 +01:00
68b911478f Fixed typo in release candidate notes. 2013-11-10 10:25:09 +01:00
875bc51909 Fix broken rdbWriteRaw() return value check in rdb.c.
Thanks to @PhoneLi for reporting.
2013-11-07 23:54:46 +01:00
c874e39c45 Sentinel: sentinelSendSlaveOf() was missing a var and the prototype. 2013-11-06 11:29:57 +01:00