All the Redis functions that need to modify the string value of a key in
a destructive way (APPEND, SETBIT, SETRANGE, ...) require to make the
object unshared (if refcount > 1) and encoded in raw format (if encoding
is not already REDIS_ENCODING_RAW).
This was cut & pasted many times in multiple places of the code. This
commit puts the small logic needed into a function called
dbUnshareStringValue().
With the new behavior it is possible to specify just the start in the
range (the end will be assumed to be the first byte), or it is possible
to specify both start and end.
This is useful to change the behavior of the command when looking for
zeros inside a string.
1) If the user specifies both start and end, and no 0 is found inside
the range, the command returns -1.
2) If instead no range is specified, or just the start is given, even
if in the actual string no 0 bit is found, the command returns the
first bit on the right after the end of the string.
So for example if the string stored at key foo is "\xff\xff":
BITPOS foo (returns 16)
BITPOS foo 0 -1 (returns -1)
BITPOS foo 0 (returns 16)
The idea is that when no end is given the user is just looking for the
first bit that is zero and can be set to 1 with SETBIT, as it is
"available". Instead when a specific range is given, we just look for a
zero within the boundaries of the range.
NetBSD-current's libc has a function named popcount.
hiding these extensions using feature macros is not possible because
redis uses other extensions covered by the same feature macro.
eg. inet_aton
When keyspace events are enabled, the overhead is not sever but
noticeable, so this commit introduces the ability to select subclasses
of events in order to avoid to generate events the user is not
interested in.
The events can be selected using redis.conf or CONFIG SET / GET.
remove unsafe and unnecessary cast.
until now, this cast may lead segmentation fault when end > UINT_MAX
setbit foo 0 1
bitcount 0 4294967295
=> ok
bitcount 0 4294967296
=> cause segmentation fault.
Note by @antirez: the commit was modified a bit to also change the
string length type to long, since it's guaranteed to be at max 512 MB in
size, so we can work with the same type across all the code path.
A regression test was also added.
For the C standard char can be either signed or unsigned, it's up to the
compiler, but Redis assumed that it was signed in a few places.
The practical effect of this patch is that now Redis 2.6 will run
correctly in every system where char is unsigned, notably the RaspBerry
PI and other ARM systems with GCC.
Thanks to Georgi Marinov (@eesn on twitter) that reported the problem
and allowed me to use his RaspBerry via SSH to trace and fix the issue!
In the issue #529 an user reported a bug that can be triggered with the
following code:
flushdb
set a
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
bitop or x a b
The bug was introduced with the speed optimization in commit 8bbc076
that specializes every BITOP operation loop up to the minimum length of
the input strings.
However the computation of the minimum length contained an error when a
non existing key was present in the input, after a key that was non zero
length.
This commit fixes the bug and adds a regression test for it.
This commit adds a fast-path to the BITOP that can be used for all the
bytes from 0 to the minimal length of the string, and if there are
at max 16 input keys.
Often the intersected bitmaps are roughly the same size, so this
optimization can provide a 10x speed boost to most real world usages
of the command.
Bytes are processed four full words at a time, in loops specialized
for the specific BITOP sub-command, without the need to check for
length issues with the inputs (since we run this algorithm only as far
as there is data from all the keys at the same time).
The remaining part of the string is intersected in the usual way using
the slow but generic algorith.
It is possible to do better than this with inputs that are not roughly
the same size, sorting the input keys by length, by initializing the
result string in a smarter way, and noticing that the final part of the
output string composed of only data from the longest string does not
need any proecessing since AND, OR and XOR against an empty string does
not alter the output (zero in the first case, and the original string in
the other two cases).
More implementations will be implemented later likely, but this should
be enough to release Redis 2.6-RC4 with bitops merged in.
Note: this commit also adds better testing for BITOP NOT command, that
is currently the faster and hard to optimize further since it just
flips the bits of a single input string.
A bug in the implementation caused BITOP to crash the server if at least
one one of the source objects was integer encoded.
The new implementation takes an additional array of Redis objects
pointers and calls getDecodedObject() to get a reference to a string
encoded object, and then uses decrRefCount() to release the object.
Tests modified to cover the regression and improve coverage.
At Redis's default optimization level the command is now much faster,
always using a constant-time bit manipualtion technique to count bits
instead of GCC builtin popcount, and unrolling the loop.
The current implementation performance is 1.5GB/s in a MBA 11" (1.8 Ghz
i7) compiled with both GCC and clang.
The algorithm used is described here:
http://graphics.stanford.edu/~seander/bithacks.html
bitop.c contains the "Bit related string operations" so it seems more
logical to call it bitops instead of bitop.
This also makes it matching the name of the test (unit/bitops.tcl).