Commit Graph

235 Commits

Author SHA1 Message Date
b15411df98 Sentinel: log quorum with +monitor event. 2014-02-24 17:10:20 +01:00
6b373edb77 Sentinel: generate +monitor events at startup. 2014-02-24 16:33:55 +01:00
3b7a757468 Sentinel: log +monitor and +set events.
Now that we have a runtime configuration system, it is very important to
be able to log how the Sentinel configuration changes over time because
of API calls.
2014-02-24 16:33:43 +01:00
25cebf7285 Sentinel: added missing exit(1) after checking for config file. 2014-02-24 16:22:52 +01:00
b1c1386374 Sentinel: IDONTKNOW error removed.
This error was conceived for the older version of Sentinel that worked
via master redirection and that was not able to get configuration
updates from other Sentinels via the Pub/Sub channel of masters or
slaves.

This reply does not make sense today, every Sentinel should reply with
the best information it has currently. The error will make even more
sense in the future since the plan is to allow Sentinels to update the
configuration of other Sentinels via gossip with a direct chat without
the prerequisite that they have at least a monitored instance in common.
2014-02-22 17:34:46 +01:00
7d7b3810e7 Sentinel: report instances role switch events.
This is useful mostly for debugging of issues.
2014-02-20 12:13:52 +01:00
7cec9e48ce Sentinel: SENTINEL_SLAVE_RECONF_RETRY_PERIOD -> RECONF_TIMEOUT
Rename define to match the new meaning.
2014-02-18 10:27:38 +01:00
18b8bad53c Sentinel: fix slave promotion timeout.
If we can't reconfigure a slave in time during failover, go forward as
anyway the slave will be fixed by Sentinels in the future, once they
detect it is misconfigured.

Otherwise a failover in progress may never terminate if for some reason
the slave is uncapable to sync with the master while at the same time
it is not disconnected.
2014-02-18 08:50:57 +01:00
e1b77b61f3 Sentinel: better specify startup errors due to config file.
Now it logs the file name if it is not accessible. Also there is a
different error for the missing config file case, and for the non
writable file case.
2014-02-17 16:44:49 +01:00
2d6eb68993 Sentinel: allow SHUTDOWN command in Sentinel mode. 2014-02-07 11:22:24 +01:00
3ff1bb4b2e Sentinel: check arity for SENTINEL MASTER command.
This fixes issue #1530.
2014-01-31 10:13:38 +01:00
d5763dceaf SENTINEL SET master quorum implemented. 2014-01-14 09:23:26 +01:00
fe86f890b0 SENTINEL SET: error on bad option name + flush config on error. 2014-01-13 11:55:57 +01:00
f822516e43 SENTINEL SET implemented.
The new command allows to change master-specific configurations
at runtime. All the settable parameters can be retrivied via the
SENTINEL MASTER command, so there is no equivalent "GET" command.
2014-01-13 11:53:29 +01:00
3cdcaff069 Sentinel: fix wrong arity error message. 2014-01-13 11:05:13 +01:00
964f6b17e9 Sentinel: SENTINEL REMOVE command added.
The command totally removes a monitored master.
2014-01-10 15:39:36 +01:00
cf2835519e Sentinel: releaseSentinelRedisInstance() top comment fixed.
The claim about unlinking the instance from the connected hash tables
was the opposite of the reality. Also the current actual behavior is
safer in most cases, so it is better to manually unlink when needed.
2014-01-10 15:33:42 +01:00
9d0f46c6f5 Sentinel: flush config on disk when new master is added. 2014-01-10 15:22:06 +01:00
39f9f449b0 Sentinel: SENTINEL MONITOR command implemented.
It allows to add new masters to monitor at runtime.
2014-01-10 15:18:24 +01:00
c42e4bd0b6 Sentinel: added SENTINEL MASTER <name> command.
With SENTINEL MASTERS it was already possible to list all the configured
masters, but not a specific one.
2014-01-10 14:41:52 +01:00
2bb9cd464e Add all the configurable fields to addReplySentinelRedisInstance().
Note: the auth password with the master is voluntarily not exposed.
2014-01-10 14:31:41 +01:00
5a7d04ee7b Trip comment to 80 cols in SentinelCommand(). 2014-01-10 14:13:04 +01:00
5320148883 Sentinel: dead code removed. 2013-12-13 11:01:13 +01:00
2eb781b35b dict.c: added optional callback to dictEmpty().
Redis hash table implementation has many non-blocking features like
incremental rehashing, however while deleting a large hash table there
was no way to have a callback called to do some incremental work.

This commit adds this support, as an optiona callback argument to
dictEmpty() that is currently called at a fixed interval (one time every
65k deletions).
2013-12-10 18:46:24 +01:00
c590549e40 Sentinel: fix reported role info sampling.
The way the role change was recoded was not sane and too much
convoluted, causing the role information to be not always updated.

This commit fixes issue #1445.
2013-12-06 12:46:56 +01:00
2b414a4b5f Sentinel: fix reported role fields when master is reset.
When there is a master address switch, the reported role must be set to
master so that we have a chance to re-sample the INFO output to check if
the new address is reporting the right role.

Otherwise if the role was wrong, it will be sensed as wrong even after
the address switch, and for enough time according to the role change
time, for Sentinel consider the master SDOWN.

This fixes isue #1446, that describes the effects of this bug in
practice.
2013-12-06 11:37:46 +01:00
11e81a1e9a Fixed grammar: before H the article is a, not an. 2013-12-05 16:35:32 +01:00
f80cf7363a Sentinel: don't write HZ when flushing config.
See issue #1419.
2013-12-02 15:56:10 +01:00
dffebbc904 Sentinel: better time desynchronization.
Sentinels are now desynchronized in a better way changing the time
handler frequency between 10 and 20 HZ. This way on average a
desynchronization of 25 milliesconds is produced that should be larger
enough compared to network latency, avoiding most split-brain condition
during the vote.

Now that the clocks are desynchronized, to have larger random delays when
performing operations can be easily achieved in the following way.
Take as example the function that starts the failover, that is
called with a frequency between 10 and 20 HZ and will start the
failover every time there are the conditions. By just adding as an
additional condition something like rand()%4 == 0, we can amplify the
desynchronization between Sentinel instances easily.

See issue #1419.
2013-12-02 12:29:42 +01:00
0addf8aff1 Sentinel: log vote received from other Sentinels. 2013-11-28 15:23:46 +01:00
86a540a66e fix a bug in sentinel.c about pub/sub link 2013-11-26 19:55:51 +08:00
6f4fd55762 Sentinel: fixes inverted strcmp() test preventing config updates.
The result of this one-char bug was pretty serious, if the new master
had the same port of the previous master, but just a different IP
address, non-leader Sentinels would not be able to recognize the
configuration change.

This commit fixes issue #1394.

Many thanks to @shanemadden that reported the bug and helped
investigating it.
2013-11-25 10:59:53 +01:00
8d547ebd56 Sentinel: fix type specifier for Hello msg generation.
This fixes issue #1395.
2013-11-25 10:24:34 +01:00
cc6053681f Sentinel: different comments updated to new implementation. 2013-11-21 16:22:59 +01:00
685e79998c Sentinel: cleanup around SENTINEL_INFO_VALIDITY_TIME. 2013-11-21 16:05:41 +01:00
489d889726 Sentinel: removed mem leak and useless code. 2013-11-21 15:43:55 +01:00
f55ad3038f Sentinel: manual failover works again. 2013-11-21 12:39:47 +01:00
297de1ab26 Sentinel: test for writable config file.
This commit introduces a funciton called when Sentinel is ready for
normal operations to avoid putting Sentinel specific stuff in redis.c.
2013-11-21 12:28:15 +01:00
d920177f8d Sentinel: check for disconnected links in sentinelSendHello().
Does not fix any bug as the test is performed by the caller, but better
to have the check.
2013-11-21 11:35:50 +01:00
8810167d13 Sentinel: Hello message sending code refactored. 2013-11-21 11:31:06 +01:00
0101c2bcfe Sentinel: select slave with best (greater) replication offset. 2013-11-20 16:05:36 +01:00
a6ebd910d8 Sentinel: take the replication offset in slaves state. 2013-11-20 15:53:21 +01:00
37a51a2568 Sentinel: distinguish between is-master-down-by-addr requests.
Some are just to know if the master is down, and in this case the runid
in the request is set to "*", others are actually in order to seek for a
vote and get elected. In the latter case the runid is set to the runid
of the instance seeking for the vote.
2013-11-19 16:50:04 +01:00
b22d1beea0 Sentinel: various fixes to leader election implementation. 2013-11-19 16:20:42 +01:00
1f9728cb20 Sentinel: failover script execution fixed. 2013-11-19 12:34:46 +01:00
90635488ce Sentinel: no longer used defines removed. 2013-11-19 11:24:36 +01:00
0a35f65301 Sentinel: when writing config on disk, remember sentinels runid. 2013-11-19 11:11:43 +01:00
5450833d02 Sentinel: arity of known-sentinel/slave is 4 not 3. 2013-11-19 11:03:47 +01:00
b8a94463b7 Sentinel: rewriteConfigSentinelOption() sub-iterators var typo fixed. 2013-11-19 10:59:50 +01:00
16237d78c8 Sentinel: call sentinelFlushConfig() to persist state when needed.
Also the sentinel configuration rewriting was modified in order to
account for failover in progress, where we need to provide the promoted
slave address as master address, and the old master address as one of
the slaves address.
2013-11-19 10:55:43 +01:00