44 Commits

Author SHA1 Message Date
Matt Stancliff
1adc4a7888 Cleanup wording of dictScan() comment
Some language in the comment was difficult
to understand, so this commit: clarifies wording, removes
unnecessary words, and relocates some dependent clauses
closer to what they actually describe.

I also tried to break up longer chains of thought
(if X, then Y, and Q, and also F, so obviously M)
into more manageable chunks for ease of understanding.
2014-10-06 10:01:36 +02:00
Michael Parker
a5044c254d Fix hash table size in comment for dictScan
Closes #1351
2014-10-06 09:58:18 +02:00
antirez
8fb8a474fd Fix dictRehash assert casting type.
Also related to #1929.
2014-08-27 10:31:52 +02:00
antirez
85b3db4065 Cast to right type in dictNext().
This closes issue #1929, the other part was fixed in the context of issue
2014-08-27 10:31:48 +02:00
Cong Ding
ae2cd2e16f Remove unused function
Closes #878
2014-08-27 10:30:11 +02:00
zhanghailei
ccd2c18c1c According to context,the size should be 16 rather than 64 2014-03-11 10:12:19 +01:00
zhanghailei
daa5d2a6c9 FIXED a typo more thank should be more than 2014-03-11 10:12:13 +01:00
antirez
b6610a569d dict.c: added optional callback to dictEmpty().
Redis hash table implementation has many non-blocking features like
incremental rehashing, however while deleting a large hash table there
was no way to have a callback called to do some incremental work.

This commit adds this support, as an optiona callback argument to
dictEmpty() that is currently called at a fixed interval (one time every
65k deletions).
2013-12-10 18:18:36 +01:00
antirez
7f6743a581 Fixed grammar: before H the article is a, not an. 2013-12-05 16:37:21 +01:00
antirez
736e343509 removed not used vars in dictScan(). 2013-11-05 17:24:12 +01:00
antirez
17785f40b5 dictScan(): empty hash table requires special handling. 2013-10-29 16:20:13 +01:00
antirez
1a06bbe430 Fixed typos in dictScan() comment. 2013-10-29 16:20:02 +01:00
antirez
0880bec4e2 dictScan() algorithm documented. 2013-10-29 16:19:57 +01:00
Pieter Noordhuis
fd22106a98 Fix error in scan algorithm
The irrelevant bits shouldn't be masked to 1. This can result in slots being
skipped when the hash table is resized between calls to the iterator.
2013-10-29 16:19:12 +01:00
Pieter Noordhuis
0dd95e23ab Add SCAN command 2013-10-29 16:18:24 +01:00
antirez
2c018b14fb dictFingerprint(): cast pointers to integer of same size. 2013-08-20 12:09:31 +02:00
antirez
5621272575 Revert "Fixed type in dict.c comment: 265 -> 256."
This reverts commit 009515f9140fc1330c14af994040a8885056986c.
2013-08-19 17:25:57 +02:00
antirez
009515f914 Fixed type in dict.c comment: 265 -> 256. 2013-08-19 15:11:07 +02:00
antirez
837817b2b7 assert.h replaced with redisassert.h when appropriate.
Also a warning was suppressed by including unistd.h in redisassert.h
(needed for _exit()).
2013-08-19 15:02:16 +02:00
antirez
c7da5fc6c1 dictFingerprint() fingerprinting made more robust.
The previous hashing used the trivial algorithm of xoring the integers
together. This is not optimal as it is very likely that different
hash table setups will hash the same, for instance an hash table at the
start of the rehashing process, and at the end, will have the same
fingerprint.

Now we hash N integers in a smarter way, by summing every integer to the
previous hash, and taking the integer hashing again (see the code for
further details). This way it is a lot less likely that we get a
collision. Moreover this way of hashing explicitly protects from the
same set of integers in a different order to hash to the same number.

This commit is related to issue #1240.
2013-08-19 15:02:10 +02:00
antirez
05379f49ee dict.c iterator API misuse protection.
dict.c allows the user to create unsafe iterators, that are iterators
that will not touch the dictionary data structure in any way, preventing
copy on write, but at the same time are limited in their usage.

The limitation is that when itearting with an unsafe iterator, no call
to other dictionary functions must be done inside the iteration loop,
otherwise the dictionary may be incrementally rehashed resulting into
missing elements in the set of the elements returned by the iterator.

However after introducing this kind of iterators a number of bugs were
found due to misuses of the API, and we are still finding
bugs about this issue. The bugs are not trivial to track because the
effect is just missing elements during the iteartion.

This commit introduces auto-detection of the API misuse. The idea is
that an unsafe iterator has a contract: from initialization to the
release of the iterator the dictionary should not change.

So we take a fingerprint of the dictionary state, xoring a few important
dict properties when the unsafe iteartor is initialized. We later check
when the iterator is released if the fingerprint is still the same. If it
is not, we found a misuse of the iterator, as not allowed API calls
changed the internal state of the dictionary.

This code was checked against a real bug, issue #1240.

This is what Redis prints (aborting) when a misuse is detected:

Assertion failed: (iter->fingerprint == dictFingerprint(iter->d)),
function dictReleaseIterator, file dict.c, line 587.
2013-08-19 15:01:58 +02:00
guiquanz
1caf09399e Fixed many typos.
Conflicts fixed, mainly because 2.8 has no cluster support / files:
	00-RELEASENOTES
	src/cluster.c
	src/crc16.c
	src/redis-trib.rb
	src/redis.h
2013-01-19 11:03:19 +01:00
charsyam
1f0d0b094a Remove unnecessary condition in _dictExpandIfNeeded (dict.c) 2012-11-28 11:45:52 +01:00
antirez
8ddb23b90c BSD license added to every C source and header file. 2012-11-08 18:34:04 +01:00
antirez
99c3338c23 Hash function switched to murmurhash2.
The previously used hash function, djbhash, is not secure against
collision attacks even when the seed is randomized as there are simple
ways to find seed-independent collisions.

The new hash function appears to be safe (or much harder to exploit at
least) in this case, and has better distribution.

Better distribution does not always means that's better. For instance in
a fast benchmark with "DEBUG POPULATE 1000000" I obtained the following
results:

    1.6 seconds with djbhash
    2.0 seconds with murmurhash2

This is due to the fact that djbhash will hash objects that follow the
pattern `prefix:<id>` and where the id is numerically near, to near
buckets. This improves the locality.

However in other access patterns with keys that have no relation
murmurhash2 has some (apparently minimal) speed advantage.

On the other hand a better distribution should significantly
improve the quality of the distribution of elements returned with
dictGetRandomKey() that is used in SPOP, SRANDMEMBER, RANDOMKEY, and
other commands.

Everything considered, and under the suspect that this commit fixes a
security issue in Redis, we are switching to the new hash function.
If some serious speed regression will be found in the future we'll be able
to step back easiliy.

This commit fixes issue #663.
2012-10-05 11:16:40 +02:00
Erik Dubbelboer
04779bdfe1 Fixed some spelling errors in the comments 2012-09-27 13:22:42 +02:00
antirez
e337b26050 Even inside #if 0 comments are comments. 2012-04-21 21:49:34 +02:00
antirez
d54943b76d Currenly not used code in dict.c commented out. 2012-04-18 23:56:15 +02:00
huangz1990
085aaef325 fix typo 2012-03-27 23:04:40 +02:00
antirez
b362c111da fixed typo in hahs function seed default value. It is no longer used but fixed to retain the old constant as default anyway. 2012-01-22 01:40:23 +01:00
antirez
a48c8d873b Fix for hash table collision attack. We simply randomize hash table initialization value at startup time. 2012-01-21 23:30:13 +01:00
antirez
aa9a61ccd7 dict.c: added macros in dict.h to set signed and unsigned 64 bit values directly inside the hash entry without using additional memory. 2011-11-08 19:41:29 +01:00
antirez
c0ba9ebe13 dict.c API names modified to be more coincise and consistent. 2011-11-08 17:07:55 +01:00
antirez
71a50956b1 dict.c: added two lower level methods for directly manipulating hash entries. This is useful in order to set 64 bit integers as values directly inside the hash entry (in order to save memory), without casting, and even in 32 bit builds. 2011-11-08 16:57:20 +01:00
antirez
6a7841eb09 added an union in the dict.h structure to store 64 bit integers directly into hash table entries. 2011-11-02 15:28:45 +01:00
antirez
4b53e7365c Introduced a safe iterator interface that can be used to iterate while accessing the dictionary at the same time. Now the default interface is consireded unsafe and should be used only with dictNext() 2011-05-10 10:15:50 +02:00
antirez
05600eb8a7 fixed two diskstore issues, a quasi-deadlock creating problems with I/O speed and a race condition among threads 2011-02-11 11:16:15 +01:00
antirez
1b1f47c915 command lookup process turned into a much more flexible and probably faster hash table 2010-11-03 11:23:59 +01:00
antirez
3856f14759 This should fix Issue 332: when there is a background process saving we still allow the hash tables to grow, but only when a critical treshold is reached. Formerly we prevented the resize at all triggering pathological O(N) behavior. Also there is a fix for the statistics in INFO about the number of keys expired 2010-09-15 14:09:41 +02:00
antirez
e0be2289e9 hash table example commented out in dict.c 2010-07-27 10:00:38 +02:00
Benjamin Kramer
399f2f401c Add zcalloc and use it where appropriate
calloc is more effecient than malloc+memset when the system uses mmap to
allocate memory. mmap always returns zeroed memory so the memset can be
avoided.  The threshold to use mmap is 16k in osx libc and 128k in bsd
libc and glibc. The kernel can lazily allocate the pages, this reduces
memory usage when we have a page table or hash table that is mostly
empty.

This change is most visible when you start a new redis instance with vm
enabled.  You'll see no increased memory usage no matter how big your
page table is.
2010-07-25 00:11:20 +02:00
Benjamin Kramer
d9dd352b36 Remove _dictAlloc and friends
zmalloc calls abort() so _dictPanic will never be called.
2010-07-24 23:10:42 +02:00
Benjamin Kramer
b1e0bd4b9b Reduce code duplication 2010-07-24 22:37:01 +02:00
antirez
e2641e09cc redis.c split into many different C files.
networking related stuff moved into networking.c

moved more code

more work on layout of source code

SDS instantaneuos memory saving. By Pieter and Salvatore at VMware ;)

cleanly compiling again after the first split, now splitting it in more C files

moving more things around... work in progress

split replication code

splitting more

Sets split

Hash split

replication split

even more splitting

more splitting

minor change
2010-07-01 14:38:51 +02:00