The quicklist takes a cached version of the ziplist representation size
in bytes. The implementation must update this length every time the
underlying ziplist changes. However quicklistReplaceAtIndex() failed to
fix the length.
During LSET calls, the size of the ziplist blob and the cached size
inside the quicklist diverged. Later, when this size is used in an
authoritative way, for example during nodes splitting in order to copy
the nodes, we end with a duplicated node that may contain random
garbage.
This commit should fix issue #3343, however several problems were found
reviewing the quicklist.c code in search of this bug that should be
addressed soon or later.
For example:
1. To take a cached ziplist length is fragile since failing to update it
leads to this kind of issues.
2. The node splitting code needs auditing. For example it works just for
a side effect of ziplistDeleteRange() to be able to cope with a wrong
count of elements to remove. The code inside quicklist.c assumes that
-1 means "delete till the end" while actually it's just a count of how
many elements to delete, and is an unsigned count. So -1 gets converted
into the maximum integer, and just by chance the ziplist code stops
deleting elements after there are no more to delete.
3. Node splitting is extremely inefficient, it copies the node and
removes elements from both nodes even when actually there is to move a
single entry from one node to the other, or when the new resulting node
is empty at all so there is nothing to copy but just to create a new
node.
However at least for Redis 3.2 to introduce fresh code inside
quicklist.c may be even more risky, so instead I'm writing a better
fuzzy tester to stress the internals a bit more in order to anticipate
other possible bugs.
This bug was found using a fuzzy tester written after having some clue
about where the bug could be. The tester eventually created a ~2000
commands sequence able to always crash Redis. I wrote a better version
of the tester that searched for the smallest sequence that could crash
Redis automatically. Later this smaller sequence was minimized by
removing random commands till it still crashed the server. This resulted
into a sequence of 7 commands. With this small sequence it was just a
matter of filling the code with enough printf() to understand enough
state to fix the bug.
This bug most experienced effect was an inability of Redis to
reconfigure back old masters to slaves after they are reachable again
after a failover. This was due to failing to reset the count of the
pending commands properly, so the master appeared fovever down.
Was introduced in Redis 3.2 new Sentinel connection sharing feature
which is a lot more complex than the 3.0 code, but more scalable.
Many thanks to people reporting the issue, and especially to
@sskorgal for investigating the issue in depth.
Hopefully closes#3285.
I recently introduced populating the autocomplete help array with the
COMMAND command if available. However this was performed before parsing
the arguments, defaulting to instance 6379. After the connection is
performed it remains stable.
The effect is that if there is an instance running on port 6339,
whatever port you specify is ignored and 6379 is connected to instead.
The right port will be selected only after a reconnection.
Close#3314.
Reference issue #3218.
Checking the code I can't find a reason why the original RESTORE
code was so opinionated about restoring only the current version. The
code in to `rdb.c` appears to be capable as always to restore data from
older versions of Redis, and the only places where it is needed the
current version in order to correctly restore data, is while loading the
opcodes, not the values itself as it happens in the case of RESTORE.
For the above reasons, this commit enables RESTORE to accept older
versions of values payloads.
Comment format fixed + local var modified from camel case to underscore
separators as Redis code base normally does (camel case is mostly used
for global symbols like structure names, function names, global vars,
...).
I'm the author of this line but I can't see a good reason for it to
don't be a typo, a step of 26 should be valid with 52 bits per
coordinate, moreover the line was:
if (step > 26) step = 25;
So a step of 26 was actually already used, except when one of 27 was
computed (which is invalid) only then it was trimmed to 25 instead of
26.
All tests passing after the change.
Probably there is no compiler that will actaully break the code or raise
a signal for unsigned -> signed overflowing conversion, still it was
apparently possible to write it in a more correct way.
All tests passing.
This change is documented in deps/README.md but was lost in one way or
the other, neutralizing the benefits of 24 bytes size classes (and
others).
Close#3208.
Use the COMMAND output to fill with partial information the built-in
help. This makes redis-cli able to at least complete commands that are
exported by the Redis server it is connected to, but were not available
in the help.h file when the redis-cli binary was compiled.