Commit Graph

397 Commits

Author SHA1 Message Date
3d448bda39 Cluster: call clusterHandleSlaveFailover() when our master is down. 2013-03-13 12:44:02 +01:00
f0b807cd47 Cluster: update cluster state on PFAIL flag set/cleared on nodes. 2013-03-07 15:40:53 +01:00
299b8f76c2 Cluster: mark cluster state as fail of majority of masters is unreachable. 2013-03-07 15:36:59 +01:00
abf06fd5ff Cluster: log global cluster state change. 2013-03-07 15:22:32 +01:00
3dad8196b7 Cluster: clusterUpdateState() function simplified.
Also the NEEDHELP Cluster state was removed as it will no longer be
used by Redis Cluster.
2013-03-06 18:25:40 +01:00
011fa89ac9 Cluster: sdssplitargs_free() -> sdsfreesplitres(). 2013-03-06 12:38:06 +01:00
1025dd7786 Cluster: connect to our master ASAP after startup if we are a slave node. 2013-03-05 16:12:08 +01:00
bac57ad14b Cluster: more robust FAIL flag cleaup.
If we have a master in FAIL state that's reachable again, and apparently
no one is going to serve its slots, clear the FAIL flag and let the
cluster continue with its operations again.
2013-03-05 15:05:32 +01:00
1a02b7440a Cluster: new node field fail_time.
This is the unix time at which we set the FAIL flag for the node.
It is only valid if FAIL is set.

The idea is to use it in order to make the cluster more robust, for
instance in order to revert a FAIL state if it is long-standing but
still slots are assigned to this node, that is, no one is going to fix
these slots apparently.
2013-03-05 13:15:05 +01:00
e4b481a5f6 Cluster: A comment updated in clusterCron(). 2013-03-05 12:17:30 +01:00
d728ec6dee Cluster: send a ping to every node we never contacted in timeout/2 seconds.
Usually we try to send just 1 ping every second, however when we detect
we are going to have unreliable failure detection because we can't ping
some node in time, send an additional ping.

This should only happen with very large clusters or when the the node
timeout is set to a very low value.
2013-03-05 12:16:02 +01:00
e7628be2a7 Cluster: set node->slaveof correctly when a node state is updated. 2013-03-05 11:50:11 +01:00
d6457577d4 Cluster: don't perform startup slots sanity check for slaves.
If we are a cluster node the DB content will not match our configured
slots. Don't do the check at all.
2013-03-04 19:47:00 +01:00
d334897e80 Cluster: fix maximum line length when loading config.
There are pathological cases where the line can be even longer a single
node may contain all the slots in importing/migrating state.
2013-03-04 19:45:36 +01:00
b8a28bf442 Cluster: actually setup replication in CLUSTER REPLICATE. 2013-03-04 15:27:58 +01:00
0c01088b51 Cluster: REPLICATE subcommand and stub for clusterSetMaster(). 2013-03-04 13:15:09 +01:00
bc84c399f8 adding check error code
adding check error code
2013-03-04 11:20:11 +01:00
caf9b24a7d Cluster: don't set the slot as unassigned because of PONG info.
As stated in the comment this is usually due to a resharding in progress
so the client should be still redirected to the old node that will
handle the redirection elsewhere.
2013-02-28 15:54:29 +01:00
0d77440b26 Cluster: better handling of slots changes in PONG packets.
The new code makes sure that the node slots bitmap is always consistent
with the cluster->slots array.
2013-02-28 15:41:54 +01:00
5f8fd27ace Cluster: refactoring of clusterNode*Bit to use helper bitmap functions. 2013-02-28 15:23:09 +01:00
d21d6b666f Cluster: use node->numslots instead of popcount() where possible. 2013-02-28 15:13:32 +01:00
4521115b17 Cluster: new field in cluster node structure, "numslots".
Before a relatively slow popcount() operation was needed every time we
needed to get the number of slots served by a given cluster node.
Now we just need to check an integer that is taken in sync with the
bitmap.
2013-02-28 15:11:05 +01:00
a2566d6618 Cluster: don't gossip about nodes that are not useful to the cluster. 2013-02-28 15:00:09 +01:00
d45d184118 Cluster: CLUSTER FORGET implemented. 2013-02-27 17:55:59 +01:00
d2b8281b3f Cluster: added a missing return on CLUSTER SETSLOT. 2013-02-27 17:53:48 +01:00
d20dea3eb7 Cluster: blank node address when flagging it as NOADDR. 2013-02-27 17:09:33 +01:00
2dcb5ab72b Cluster: add comments in sub-sections of CLUSTER command. 2013-02-27 16:12:59 +01:00
f9b5ca29fd Use GCC printf format attribute for redisLog().
This commit also fixes redisLog() statements producing warnings.
2013-02-27 12:27:15 +01:00
d0992d6e8b Cluster: a few random fixes to the new failure detection. 2013-02-26 15:15:44 +01:00
f288b07563 Cluster: log the event when we clear the FAIL flag. 2013-02-26 15:03:38 +01:00
97ffcd351b Cluster: use the failure report API to reimplement failure detection.
The new system detects a failure only when there is quorum from masters.
2013-02-26 14:58:39 +01:00
1b1b3f6c06 Cluster: invert two functions declarations in more natural order. 2013-02-26 11:19:48 +01:00
d5e8b0a47f Cluster: cleanup idle failure reports every time we remove one.
This is not very important as anyway when the function counting the
number of reports is called the cleanup is performed. However with this
change if only part of the nodes that reported the failure will report
the node is back ok, we'll cleanup the older entries ASAP. In complex
split net split scenarios, and when we are dealing with clusters having
nodes in the order of ~ 1000, this can save some CPU.
2013-02-26 11:15:18 +01:00
9cb578ced0 Cluster: new function clusterNodeDelFailureReport() for failure reports.
This is the missing part of the API that will be used to reimplement
failure detection of Cluster nodes.
2013-02-25 19:13:22 +01:00
18f537083a Cluster: no limits for the count parameter of CLUSTER GETKEYSINSLOT.
Not sure why I set a limit to 1 million keys, there is no reason for
this artificial limit, and anyway this is s a stupid limit because it is
already high enough to create latency issues. So let's the users shoot
on their feet because maybe they just actually know what they are doing.
2013-02-25 12:41:13 +01:00
544bbe5387 Cluster: validate slot number in CLUSTER COUNTKEYSINSLOT. 2013-02-25 12:40:32 +01:00
d4fa40655d Cluster: new sub-command CLUSTER COUNTKEYSINSLOT.
The new sub-command uses the new countKeysInSlot() API and allows a
cluster client to get the number of keys for a given hashslot.
2013-02-25 12:04:31 +01:00
a517c89321 Cluster: verifyClusterConfigWithData() implemented. 2013-02-25 11:43:49 +01:00
d2154254be Cluster: fix case for getKeysInSlot() and countKeysInSlot().
Redis functions start in low case. A few functions about cluster were
capitalized the wrong way.
2013-02-25 11:25:40 +01:00
c2eb4a606f Cluster: use CountKeysInSlot() when we just need the count. 2013-02-25 11:23:04 +01:00
ad3bca1fdf Cluster: added stub for verifyClusterConfigWithData().
See the top-comment for the function in this commit for details about
what the function is supposed to do.
2013-02-25 11:20:17 +01:00
825e07f2fd Cluster: if no previous config exists, create the myself node as master. 2013-02-22 19:24:01 +01:00
f4093753e4 Cluster: add cluster_size field in CLUSTER INFO output. 2013-02-22 19:20:38 +01:00
d218a4e244 Cluster: new state information, cluster size.
The definition of cluster size is: the number of known nodes in the
cluster that are masters and serving at least an hash slot.
2013-02-22 19:18:30 +01:00
5c55ed9388 Cluster: remove warning adding clusterNodeSetSlotBit() prototype. 2013-02-22 17:45:49 +01:00
974929770b Cluster: introduced a failure reports system.
A §Redis Cluster node used to mark a node as failing when itself
detected a failure for that node, and a single acknowledge was received
about the possible failure state.

The new API will be used in order to possible to require that N other
nodes have a PFAIL or FAIL state for a given node for a node to set it
as failing.
2013-02-22 17:43:35 +01:00
07b6322735 Cluster: more correct update of slots state when PONG is received. 2013-02-21 16:52:06 +01:00
c6da9d9fac Call clusterUpdateState() after CLUSTER SETSLOT too. 2013-02-21 16:31:22 +01:00
3a99d1228a Aesthetic change to make a line more 80-cols friendly. 2013-02-21 16:24:48 +01:00
dc4af60628 Cluster: clusterAddSlot() was not doing what stated in the comment. 2013-02-21 11:51:17 +01:00