Ziplists had a bug that was discovered while investigating a different
issue, resulting in a corrupted ziplist representation, and a likely
segmentation foult and/or data corruption of the last element of the
ziplist, once the ziplist is accessed again.
The bug happens when a specific set of insertions / deletions is
performed so that an entry is encoded to have a "prevlen" field (the
length of the previous entry) of 5 bytes but with a count that could be
encoded in a "prevlen" field of a since byte. This could happen when the
"cascading update" process called by ziplistInsert()/ziplistDelete() in
certain contitious forces the prevlen to be bigger than necessary in
order to avoid too much data moving around.
Once such an entry is generated, inserting a very small entry
immediately before it will result in a resizing of the ziplist for a
count smaller than the current ziplist length (which is a violation,
inserting code expects the ziplist to get bigger actually). So an FF
byte is inserted in a misplaced position. Moreover a realloc() is
performed with a count smaller than the ziplist current length so the
final bytes could be trashed as well.
SECURITY IMPLICATIONS:
Currently it looks like an attacker can only crash a Redis server by
providing specifically choosen commands. However a FF byte is written
and there are other memory operations that depend on a wrong count, so
even if it is not immediately apparent how to mount an attack in order
to execute code remotely, it is not impossible at all that this could be
done. Attacks always get better... and we did not spent enough time in
order to think how to exploit this issue, but security researchers
or malicious attackers could.
This fixes issue #3043.
Before this fix, after a complete resharding of a master slots
to other nodes, the master remains empty and the slaves migrate away
to other masters with non-zero nodes. However the old master now empty,
is no longer considered a target for migration, because the system has
no way to tell it had slaves in the past.
This fix leaves the algorithm used in the past untouched, but adds a
new rule. When a new or old master which is empty and without slaves,
are assigend with their first slot, if other masters in the cluster have
slaves, they are automatically considered to be targets for replicas
migration.
This fixes a bug introduced by d827dbf, and makes the code consistent
with the logic of always allowing, while the cluster is down, commands
that don't target any key.
As a side effect the code is also simpler now.
CLUSTER SLOTS now includes IDs in the nodes description associated with
a given slot range. Certain client libraries implementations need a way
to reference a node in an unique way, so they were relying on CLUSTER
NODES, that is not a stable API and may change frequently depending on
Redis Cluster future requirements.
The change covers the case where:
1. There is a node we can't reach (in fail or pfail state).
2. We see a different address for this node, in the gossip section sent
to us by a node that, instead, is able to talk with the node we cannot
talk to.
In this case it's a good bet to switch to the address reported by this
node, since there was an address switch and it is able to talk with the
node and we are not.
However previosuly this was done in a dangerous way, by initiating an
handshake. The handshake, using the MEET packet, forces the receiver to
join our cluster, and this is not a good idea. If the node in question
really just switched address, but is the same node, it already knows about
us, so we just need to perform an address update and a reconnection.
So with this commit instead we just update the address of the node,
release the node link if any, and attempt to reconnect in the next
clusterCron() cycle.
The commit also improves debugging messages printed by Cluster during
address or ID switches.
Another leak was fixed in the case of syntax error by restructuring the
allocation strategy for the two dynamic vectors.
We also make sure to always close the cached socket on I/O errors so that
all the I/O errors are handled the same, even if we had a previously
queued error of a different kind from the destination server.
Thanks to Kevin McGehee. Related to issue #3016.
In issue #3016 Kevin McGehee identified multiple very serious issues in
the new implementation of MIGRATE. This commit attempts to restructure
the code in oder to avoid mistakes, an analysis of the new
implementation is in progress in order to check for possible edge cases.
With this commit we preserve the list of nodes that have .slaveof set
to the node, even when the node is turned into a slave, and make sure to
fix the .slaveof pointers to NULL when a node is freed from memory,
regardless of the fact it's a slave or a master.
Basically we try to remember the logical master in the current
configuration even if the logical master advertised it as a slave
already. However we still remember the associations, so that when a node
is freed we can fix them.
This should fix issue #3002.