#sidebar
SortedSetCommandsSidebar 1.1) =
1.3.4) =
+
Time complexity: O(log(N))+O(M) with N being the number of elements in the sorted set and M the number of elements returned by the command, so if M is constant (for instance you always ask for the first ten elements with LIMIT) you can consider it O(log(N)).
Return all the elements in the sorted set at key with a score between min and max (including elements with score equal to min or max).
The elements having the same score are returned sorted lexicographically as ASCII strings (this follows from a property of Redis sorted sets and does not involve further computation).
-
Using the optional LIMIT it's possible to get only a range of the matching elements in an SQL-like way. Note that if offset is large, the command needs to traverse the list for offset elements, and this adds to the O(M) figure.
+
Using the optional LIMIT it's possible to get only a range of the matching elements in an SQL-like way. Note that if offset is large, the command needs to traverse the list for offset elements, and this adds to the O(M) figure.
+
The ZCOUNT command is similar to ZRANGEBYSCORE but instead of returning the actual elements in the specified interval, it just returns the number of matching elements.
min and max can be -inf and +inf, so you do not need to know the greatest or smallest score in the sorted set in order to get, for instance, all elements "up to a given value".
Also, while the interval is closed (inclusive) by default, it's possible to specify open intervals by prefixing the score with a "(" character, for instance:
ZRANGEBYSCORE zset (1.3 5
@@ -40,7 +42,7 @@ Will return all the values with score > 1.3 and <= 5, while for ins
ZRANGEBYSCORE zset (5 (10
Will return all the values with score > 5 and < 10 (5 and 10 excluded).
-
Multi bulk reply, specifically a list of elements in the specified score range.
+
ZRANGEBYSCORE returns a Multi bulk reply, specifically a list of elements in the specified score range.
ZCOUNT returns an Integer reply, specifically the number of elements matching the specified score range.
redis> zadd zset 1 foo
@@ -56,6 +58,8 @@ redis> zrangebyscore zset -inf +inf
2. "bar"
3. "biz"
4. "foz"
+redis> zcount zset 1 2
+(integer) 2
redis> zrangebyscore zset 1 2
1. "foo"
2. "bar"
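The interval syntax described above ("(" for an exclusive bound, -inf and +inf for unbounded ends) can be sketched in C as follows. This is only an illustration under stated assumptions: `score_bound`, `parse_score_bound` and `score_in_range` are hypothetical names introduced here, and the code is not the parser Redis itself uses.

```c
#include <math.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type, not part of Redis: one end of a score interval. */
typedef struct {
    double value;   /* numeric bound, may be -INFINITY or INFINITY */
    int exclusive;  /* 1 if the bound was written with a "(" prefix */
} score_bound;

/* Parse "1.3", "(1.3", "-inf" or "+inf".
 * Returns 0 on success, -1 if the text is not a valid bound. */
int parse_score_bound(const char *text, score_bound *out) {
    char *end;

    out->exclusive = 0;
    if (text[0] == '(') {
        out->exclusive = 1;
        text++;
    }
    if (strcmp(text, "-inf") == 0) { out->value = -INFINITY; return 0; }
    if (strcmp(text, "+inf") == 0) { out->value = INFINITY; return 0; }

    out->value = strtod(text, &end);
    if (end == text || *end != '\0') return -1;
    return 0;
}

/* Return 1 if score lies inside the interval, honouring open bounds. */
int score_in_range(double score, const score_bound *min, const score_bound *max) {
    if (min->exclusive ? score <= min->value : score < min->value) return 0;
    if (max->exclusive ? score >= max->value : score > max->value) return 0;
    return 1;
}
```

With the bounds "(1.3" and "5", a score of 1.3 is rejected and a score of 5 is accepted, which matches the "score > 1.3 and <= 5" reading given in the text.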
diff --git a/doc/ZunionCommand.html b/doc/ZunionCommand.html
index edb52a9c..cb5b844d 100644
--- a/doc/ZunionCommand.html
+++ b/doc/ZunionCommand.html
@@ -16,7 +16,7 @@
ZunionCommand
@@ -27,8 +27,9 @@
-
1.3.5) =
Time complexity: O(N) + O(M log(M)) with N being the sum of the sizes of the input sorted sets, and M being the number of elements in the resulting sorted set.
Creates a union or intersection of N sorted sets given by keys k1 through kN, and stores it at dstkey. It is mandatory to provide the number of input keys N, before passing the input keys and the other (optional) arguments.
-
As the terms imply, the ZINTER command requires an element to be present in each of the given inputs to be inserted in the result. The ZUNION command inserts all elements across all inputs.
+
1.3.12) =
+
1.3.12) =
Time complexity: O(N) + O(M log(M)) with N being the sum of the sizes of the input sorted sets, and M being the number of elements in the resulting sorted set.
Creates a union or intersection of N sorted sets given by keys k1 through kN, and stores it at dstkey. It is mandatory to provide the number of input keys N, before passing the input keys and the other (optional) arguments.
+
As the terms imply, the ZINTERSTORE command requires an element to be present in each of the given inputs to be inserted in the result. The ZUNIONSTORE command inserts all elements across all inputs.
Using the WEIGHTS option, it is possible to add weight to each input sorted set. This means that the score of each element in the sorted set is first multiplied by this weight before being passed to the aggregation. When this option is not given, all weights default to 1.
With the AGGREGATE option, it's possible to specify how the results of the union or intersection are aggregated. This option defaults to SUM, where the score of an element is summed across the inputs where it exists. When this option is set to be either MIN or MAX, the resulting set will contain the minimum or maximum score of an element across the inputs where it exists.
Integer reply, specifically the number of elements in the sorted set at
dstkey.
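The interaction of WEIGHTS and AGGREGATE for a single element can be sketched as below. This is a minimal illustration, not the code in t_zset.c: `aggregate_mode` and `aggregate_score` are hypothetical names, and the `present` array is an assumption of this sketch standing in for "the element exists in input i".

```c
#include <stddef.h>

/* Hypothetical enum mirroring the AGGREGATE option values. */
typedef enum { AGG_SUM, AGG_MIN, AGG_MAX } aggregate_mode;

/* Combine the scores one element has across n input sorted sets.
 * scores[i]  : score of the element in input i (ignored if absent)
 * weights    : per-input multipliers, or NULL for the default of 1
 * present[i] : nonzero if the element exists in input i
 * Returns 1 and stores the aggregated score in *result if the element
 * exists in at least one input, otherwise returns 0. */
int aggregate_score(const double *scores, const double *weights,
                    const int *present, size_t n,
                    aggregate_mode mode, double *result) {
    int found = 0;
    double acc = 0.0;

    for (size_t i = 0; i < n; i++) {
        if (!present[i]) continue;
        /* The score is multiplied by the weight before aggregation. */
        double s = scores[i] * (weights ? weights[i] : 1.0);
        if (!found) {
            acc = s;
            found = 1;
        } else if (mode == AGG_SUM) {
            acc += s;
        } else if (mode == AGG_MIN) {
            if (s < acc) acc = s;
        } else {
            if (s > acc) acc = s;
        }
    }
    if (found) *result = acc;
    return found;
}
```

For a union, an element is emitted whenever the function returns 1. For an intersection, a caller would additionally require every entry of `present` to be nonzero before emitting the element.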
diff --git a/doc/ZunionstoreCommand.html b/doc/ZunionstoreCommand.html
index a9f74326..862c38bb 100644
--- a/doc/ZunionstoreCommand.html
+++ b/doc/ZunionstoreCommand.html
@@ -16,7 +16,7 @@
ZunionstoreCommand
@@ -27,8 +27,10 @@
-
1.3.5) =
Time complexity: O(N) + O(M log(M)) with N being the sum of the sizes of the input sorted sets, and M being the number of elements in the resulting sorted set.
Creates a union or intersection of N sorted sets given by keys k1 through kN, and stores it at dstkey. It is mandatory to provide the number of input keys N, before passing the input keys and the other (optional) arguments.
-
As the terms imply, the ZINTER command requires an element to be present in each of the given inputs to be inserted in the result. The ZUNION command inserts all elements across all inputs.
+
1.3.12) =
+
1.3.12) =
+
Time complexity: O(N) + O(M log(M)) with N being the sum of the sizes of the input sorted sets, and M being the number of elements in the resulting sorted set.
Creates a union or intersection of N sorted sets given by keys k1 through kN, and stores it at dstkey. It is mandatory to provide the number of input keys N, before passing the input keys and the other (optional) arguments.
+
As the terms imply, the ZINTERSTORE command requires an element to be present in each of the given inputs to be inserted in the result. The ZUNIONSTORE command inserts all elements across all inputs.
Using the WEIGHTS option, it is possible to add weight to each input sorted set. This means that the score of each element in the sorted set is first multiplied by this weight before being passed to the aggregation. When this option is not given, all weights default to 1.
With the AGGREGATE option, it's possible to specify how the results of the union or intersection are aggregated. This option defaults to SUM, where the score of an element is summed across the inputs where it exists. When this option is set to be either MIN or MAX, the resulting set will contain the minimum or maximum score of an element across the inputs where it exists.
Integer reply, specifically the number of elements in the sorted set at
dstkey.
diff --git a/doc/index.html b/doc/index.html
index 2cf5d9a8..1c72b230 100644
--- a/doc/index.html
+++ b/doc/index.html
@@ -26,12 +26,12 @@
- = Redis Documentation =
Russian Translation
Hello! The following are pointers to different parts of the Redis Documentation.
-
- The Redis Replication HOWTO is what you need to read in order to understand how Redis master <-> slave replication works.
- The Append Only File HOWTO explains how the alternative Redis durability mode works. AOF is an alternative to snapshotting on disk from time to time (the default).
- Virutal Memory User Guide. A simple to understand guide about using and configuring the Redis Virtual Memory.
+ = Redis Documentation =
Russian Translation
Hello! The following are pointers to different parts of the Redis Documentation.
+
- The Redis Replication HOWTO is what you need to read in order to understand how Redis master <-> slave replication works.
- The Append Only File HOWTO explains how the alternative Redis durability mode works. AOF is an alternative to snapshotting on disk from time to time (the default).
- Virtual Memory User Guide. A simple to understand guide about using and configuring the Redis Virtual Memory.
- The Protocol Specification is all you need in order to implement a Redis client library for a missing language. PHP, Python, Ruby and Erlang are already supported.
- Look at Redis Internals if you are interested in the implementation details of the Redis server.
-
+
diff --git a/src/Makefile b/src/Makefile
index 0af70f17..e1e989c6 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -15,11 +15,11 @@ endif
CCOPT= $(CFLAGS) $(CCLINK) $(ARCH) $(PROF)
DEBUG?= -g -rdynamic -ggdb
-INSTALL_TOP= /usr/local
-INSTALL_BIN= $(INSTALL_TOP)/bin
+PREFIX= /usr/local
+INSTALL_BIN= $(PREFIX)/bin
INSTALL= cp -p
-OBJ = adlist.o ae.o anet.o dict.o redis.o sds.o zmalloc.o lzf_c.o lzf_d.o pqsort.o zipmap.o sha1.o ziplist.o release.o networking.o util.o object.o db.o replication.o rdb.o t_string.o t_list.o t_set.o t_zset.o t_hash.o config.o aof.o vm.o pubsub.o multi.o debug.o sort.o
+OBJ = adlist.o ae.o anet.o dict.o redis.o sds.o zmalloc.o lzf_c.o lzf_d.o pqsort.o zipmap.o sha1.o ziplist.o release.o networking.o util.o object.o db.o replication.o rdb.o t_string.o t_list.o t_set.o t_zset.o t_hash.o config.o aof.o vm.o pubsub.o multi.o debug.o sort.o intset.o
BENCHOBJ = ae.o anet.o redis-benchmark.o sds.o adlist.o zmalloc.o
CLIOBJ = anet.o sds.o adlist.o redis-cli.o zmalloc.o linenoise.o
CHECKDUMPOBJ = redis-check-dump.o lzf_c.o lzf_d.o
@@ -33,6 +33,7 @@ CHECKAOFPRGNAME = redis-check-aof
all: redis-server redis-benchmark redis-cli redis-check-dump redis-check-aof
+
# Deps (use make dep to generate this)
adlist.o: adlist.c adlist.h zmalloc.h
ae.o: ae.c ae.h zmalloc.h config.h ae_kqueue.c
@@ -40,22 +41,59 @@ ae_epoll.o: ae_epoll.c
ae_kqueue.o: ae_kqueue.c
ae_select.o: ae_select.c
anet.o: anet.c fmacros.h anet.h
+aof.o: aof.c redis.h fmacros.h config.h ae.h sds.h dict.h adlist.h \
+ zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h
+config.o: config.c redis.h fmacros.h config.h ae.h sds.h dict.h adlist.h \
+ zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h
+db.o: db.c redis.h fmacros.h config.h ae.h sds.h dict.h adlist.h \
+ zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h
+debug.o: debug.c redis.h fmacros.h config.h ae.h sds.h dict.h adlist.h \
+ zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h sha1.h
dict.o: dict.c fmacros.h dict.h zmalloc.h
+intset.o: intset.c intset.h zmalloc.h
linenoise.o: linenoise.c fmacros.h
lzf_c.o: lzf_c.c lzfP.h
lzf_d.o: lzf_d.c lzfP.h
+multi.o: multi.c redis.h fmacros.h config.h ae.h sds.h dict.h adlist.h \
+ zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h
+networking.o: networking.c redis.h fmacros.h config.h ae.h sds.h dict.h \
+ adlist.h zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h
+object.o: object.c redis.h fmacros.h config.h ae.h sds.h dict.h adlist.h \
+ zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h
pqsort.o: pqsort.c
+pubsub.o: pubsub.c redis.h fmacros.h config.h ae.h sds.h dict.h adlist.h \
+ zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h
+rdb.o: rdb.c redis.h fmacros.h config.h ae.h sds.h dict.h adlist.h \
+ zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h lzf.h
redis-benchmark.o: redis-benchmark.c fmacros.h ae.h anet.h sds.h adlist.h \
zmalloc.h
redis-check-aof.o: redis-check-aof.c fmacros.h config.h
redis-check-dump.o: redis-check-dump.c lzf.h
-redis-cli.o: redis-cli.c fmacros.h anet.h sds.h adlist.h zmalloc.h \
- linenoise.h
-redis.o: redis.c fmacros.h config.h redis.h ae.h sds.h anet.h dict.h \
- adlist.h zmalloc.h lzf.h pqsort.h zipmap.h ziplist.h sha1.h
+redis-cli.o: redis-cli.c fmacros.h version.h anet.h sds.h adlist.h \
+ zmalloc.h linenoise.h
+redis.o: redis.c redis.h fmacros.h config.h ae.h sds.h dict.h adlist.h \
+ zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h
release.o: release.c release.h
+replication.o: replication.c redis.h fmacros.h config.h ae.h sds.h dict.h \
+ adlist.h zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h
sds.o: sds.c sds.h zmalloc.h
sha1.o: sha1.c sha1.h
+sort.o: sort.c redis.h fmacros.h config.h ae.h sds.h dict.h adlist.h \
+ zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h pqsort.h
+t_hash.o: t_hash.c redis.h fmacros.h config.h ae.h sds.h dict.h adlist.h \
+ zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h
+t_list.o: t_list.c redis.h fmacros.h config.h ae.h sds.h dict.h adlist.h \
+ zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h
+t_set.o: t_set.c redis.h fmacros.h config.h ae.h sds.h dict.h adlist.h \
+ zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h
+t_string.o: t_string.c redis.h fmacros.h config.h ae.h sds.h dict.h \
+ adlist.h zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h
+t_zset.o: t_zset.c redis.h fmacros.h config.h ae.h sds.h dict.h adlist.h \
+ zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h
+util.o: util.c redis.h fmacros.h config.h ae.h sds.h dict.h adlist.h \
+ zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h
+vm.o: vm.c redis.h fmacros.h config.h ae.h sds.h dict.h adlist.h \
+ zmalloc.h anet.h zipmap.h ziplist.h intset.h version.h
ziplist.o: ziplist.c zmalloc.h ziplist.h
zipmap.o: zipmap.c zmalloc.h
zmalloc.o: zmalloc.c config.h
@@ -115,6 +153,7 @@ noopt:
make PROF="-pg" ARCH="-arch i386"
install: all
+ mkdir -p $(INSTALL_BIN)
$(INSTALL) $(PRGNAME) $(INSTALL_BIN)
$(INSTALL) $(BENCHPRGNAME) $(INSTALL_BIN)
$(INSTALL) $(CLIPRGNAME) $(INSTALL_BIN)
diff --git a/src/aof.c b/src/aof.c
index 942d4afd..eb67a7bd 100644
--- a/src/aof.c
+++ b/src/aof.c
@@ -189,6 +189,7 @@ struct redisClient *createFakeClient(void) {
c->querybuf = sdsempty();
c->argc = 0;
c->argv = NULL;
+ c->bufpos = 0;
c->flags = 0;
/* We set the fake client as a slave waiting for the synchronization
* so that Redis will not try to send replies to this client. */
@@ -272,12 +273,14 @@ int loadAppendOnlyFile(char *filename) {
fakeClient->argc = argc;
fakeClient->argv = argv;
cmd->proc(fakeClient);
- /* Discard the reply objects list from the fake client */
- while(listLength(fakeClient->reply))
- listDelNode(fakeClient->reply,listFirst(fakeClient->reply));
+
+ /* The fake client should not have a reply */
+ redisAssert(fakeClient->bufpos == 0 && listLength(fakeClient->reply) == 0);
+
/* Clean up, ready for the next command */
for (j = 0; j < argc; j++) decrRefCount(argv[j]);
zfree(argv);
+
/* Handle swapping while loading big datasets when VM is on */
force_swapout = 0;
if ((zmalloc_used_memory() - server.vm_max_memory) > 1024*1024*32)
@@ -307,7 +310,7 @@ readerr:
}
exit(1);
fmterr:
- redisLog(REDIS_WARNING,"Bad file format reading the append only file");
+ redisLog(REDIS_WARNING,"Bad file format reading the append only file: make a backup of your AOF file, then use ./redis-check-aof --fix <filename>");
exit(1);
}
@@ -463,20 +466,30 @@ int rewriteAppendOnlyFile(char *filename) {
redisPanic("Unknown list encoding");
}
} else if (o->type == REDIS_SET) {
+ char cmd[]="*3\r\n$4\r\nSADD\r\n";
+
/* Emit the SADDs needed to rebuild the set */
- dict *set = o->ptr;
- dictIterator *di = dictGetIterator(set);
- dictEntry *de;
-
- while((de = dictNext(di)) != NULL) {
- char cmd[]="*3\r\n$4\r\nSADD\r\n";
- robj *eleobj = dictGetEntryKey(de);
-
- if (fwrite(cmd,sizeof(cmd)-1,1,fp) == 0) goto werr;
- if (fwriteBulkObject(fp,&key) == 0) goto werr;
- if (fwriteBulkObject(fp,eleobj) == 0) goto werr;
+ if (o->encoding == REDIS_ENCODING_INTSET) {
+ int ii = 0;
+ int64_t llval;
+ while(intsetGet(o->ptr,ii++,&llval)) {
+ if (fwrite(cmd,sizeof(cmd)-1,1,fp) == 0) goto werr;
+ if (fwriteBulkObject(fp,&key) == 0) goto werr;
+ if (fwriteBulkLongLong(fp,llval) == 0) goto werr;
+ }
+ } else if (o->encoding == REDIS_ENCODING_HT) {
+ dictIterator *di = dictGetIterator(o->ptr);
+ dictEntry *de;
+ while((de = dictNext(di)) != NULL) {
+ robj *eleobj = dictGetEntryKey(de);
+ if (fwrite(cmd,sizeof(cmd)-1,1,fp) == 0) goto werr;
+ if (fwriteBulkObject(fp,&key) == 0) goto werr;
+ if (fwriteBulkObject(fp,eleobj) == 0) goto werr;
+ }
+ dictReleaseIterator(di);
+ } else {
+ redisPanic("Unknown set encoding");
}
- dictReleaseIterator(di);
} else if (o->type == REDIS_ZSET) {
/* Emit the ZADDs needed to rebuild the sorted set */
zset *zs = o->ptr;
@@ -620,12 +633,11 @@ int rewriteAppendOnlyFileBackground(void) {
void bgrewriteaofCommand(redisClient *c) {
if (server.bgrewritechildpid != -1) {
- addReplySds(c,sdsnew("-ERR background append only file rewriting already in progress\r\n"));
+ addReplyError(c,"Background append only file rewriting already in progress");
return;
}
if (rewriteAppendOnlyFileBackground() == REDIS_OK) {
- char *status = "+Background append only file rewriting started\r\n";
- addReplySds(c,sdsnew(status));
+ addReplyStatus(c,"Background append only file rewriting started");
} else {
addReply(c,shared.err);
}
diff --git a/src/config.c b/src/config.c
index eeec9e8c..4257fc36 100644
--- a/src/config.c
+++ b/src/config.c
@@ -201,6 +201,8 @@ void loadServerConfig(char *filename) {
server.list_max_ziplist_entries = memtoll(argv[1], NULL);
} else if (!strcasecmp(argv[0],"list-max-ziplist-value") && argc == 2){
server.list_max_ziplist_value = memtoll(argv[1], NULL);
+ } else if (!strcasecmp(argv[0],"set-max-intset-entries") && argc == 2){
+ server.set_max_intset_entries = memtoll(argv[1], NULL);
} else {
err = "Bad directive or wrong number of arguments"; goto loaderr;
}
@@ -241,6 +243,7 @@ void configSetCommand(redisClient *c) {
if (getLongLongFromObject(o,&ll) == REDIS_ERR ||
ll < 0) goto badfmt;
server.maxmemory = ll;
+ if (server.maxmemory) freeMemoryIfNeeded();
} else if (!strcasecmp(c->argv[2]->ptr,"timeout")) {
if (getLongLongFromObject(o,&ll) == REDIS_ERR ||
ll < 0 || ll > LONG_MAX) goto badfmt;
@@ -270,8 +273,8 @@ void configSetCommand(redisClient *c) {
stopAppendOnly();
} else {
if (startAppendOnly() == REDIS_ERR) {
- addReplySds(c,sdscatprintf(sdsempty(),
- "-ERR Unable to turn on AOF. Check server logs.\r\n"));
+ addReplyError(c,
+ "Unable to turn on AOF. Check server logs.");
decrRefCount(o);
return;
}
@@ -312,9 +315,8 @@ void configSetCommand(redisClient *c) {
}
sdsfreesplitres(v,vlen);
} else {
- addReplySds(c,sdscatprintf(sdsempty(),
- "-ERR not supported CONFIG parameter %s\r\n",
- (char*)c->argv[2]->ptr));
+ addReplyErrorFormat(c,"Unsupported CONFIG parameter: %s",
+ (char*)c->argv[2]->ptr);
decrRefCount(o);
return;
}
@@ -323,22 +325,18 @@ void configSetCommand(redisClient *c) {
return;
badfmt: /* Bad format errors */
- addReplySds(c,sdscatprintf(sdsempty(),
- "-ERR invalid argument '%s' for CONFIG SET '%s'\r\n",
+ addReplyErrorFormat(c,"Invalid argument '%s' for CONFIG SET '%s'",
(char*)o->ptr,
- (char*)c->argv[2]->ptr));
+ (char*)c->argv[2]->ptr);
decrRefCount(o);
}
void configGetCommand(redisClient *c) {
robj *o = getDecodedObject(c->argv[2]);
- robj *lenobj = createObject(REDIS_STRING,NULL);
+ void *replylen = addDeferredMultiBulkLength(c);
char *pattern = o->ptr;
int matches = 0;
- addReply(c,lenobj);
- decrRefCount(lenobj);
-
if (stringmatch(pattern,"dbfilename",0)) {
addReplyBulkCString(c,"dbfilename");
addReplyBulkCString(c,server.dbfilename);
@@ -410,7 +408,7 @@ void configGetCommand(redisClient *c) {
matches++;
}
decrRefCount(o);
- lenobj->ptr = sdscatprintf(sdsempty(),"*%d\r\n",matches*2);
+ setDeferredMultiBulkLength(c,replylen,matches*2);
}
void configCommand(redisClient *c) {
@@ -428,13 +426,12 @@ void configCommand(redisClient *c) {
server.stat_starttime = time(NULL);
addReply(c,shared.ok);
} else {
- addReplySds(c,sdscatprintf(sdsempty(),
- "-ERR CONFIG subcommand must be one of GET, SET, RESETSTAT\r\n"));
+ addReplyError(c,
+ "CONFIG subcommand must be one of GET, SET, RESETSTAT");
}
return;
badarity:
- addReplySds(c,sdscatprintf(sdsempty(),
- "-ERR Wrong number of arguments for CONFIG %s\r\n",
- (char*) c->argv[1]->ptr));
+ addReplyErrorFormat(c,"Wrong number of arguments for CONFIG %s",
+ (char*) c->argv[1]->ptr);
}
diff --git a/src/config.h b/src/config.h
index 6e98fbb2..e2d84818 100644
--- a/src/config.h
+++ b/src/config.h
@@ -21,6 +21,16 @@
#define redis_stat stat
#endif
+/* test for proc filesystem */
+#ifdef __linux__
+#define HAVE_PROCFS 1
+#endif
+
+/* test for task_info() */
+#if defined(__APPLE__)
+#define HAVE_TASKINFO 1
+#endif
+
/* test for backtrace() */
#if defined(__APPLE__) || defined(__linux__)
#define HAVE_BACKTRACE 1
diff --git a/src/db.c b/src/db.c
index 958a9f6b..44507847 100644
--- a/src/db.c
+++ b/src/db.c
@@ -45,7 +45,7 @@ robj *lookupKeyRead(redisDb *db, robj *key) {
}
robj *lookupKeyWrite(redisDb *db, robj *key) {
- deleteIfVolatile(db,key);
+ expireIfNeeded(db,key);
return lookupKey(db,key);
}
@@ -123,6 +123,11 @@ robj *dbRandomKey(redisDb *db) {
/* Delete a key, value, and associated expiration entry if any, from the DB */
int dbDelete(redisDb *db, robj *key) {
+ /* If VM is enabled make sure to awake waiting clients for this key:
+ * deleting the key will kill the I/O thread bringing the key from swap
+ * to memory, so the client will never be notified and unblocked if we
+ * don't do it now. */
+ if (server.vm_enabled) handleClientsBlockedOnSwappedKey(db,key);
/* Deleting an entry from the expires dict will not free the sds of
* the key, because it is shared with the main dictionary. */
if (dictSize(db->expires) > 0) dictDelete(db->expires,key->ptr);
@@ -199,7 +204,7 @@ void selectCommand(redisClient *c) {
int id = atoi(c->argv[1]->ptr);
if (selectDb(c,id) == REDIS_ERR) {
- addReplySds(c,sdsnew("-ERR invalid DB index\r\n"));
+ addReplyError(c,"invalid DB index");
} else {
addReply(c,shared.ok);
}
@@ -221,19 +226,17 @@ void keysCommand(redisClient *c) {
dictIterator *di;
dictEntry *de;
sds pattern = c->argv[1]->ptr;
- int plen = sdslen(pattern);
+ int plen = sdslen(pattern), allkeys;
unsigned long numkeys = 0;
- robj *lenobj = createObject(REDIS_STRING,NULL);
+ void *replylen = addDeferredMultiBulkLength(c);
di = dictGetIterator(c->db->dict);
- addReply(c,lenobj);
- decrRefCount(lenobj);
+ allkeys = (pattern[0] == '*' && pattern[1] == '\0');
while((de = dictNext(di)) != NULL) {
sds key = dictGetEntryKey(de);
robj *keyobj;
- if ((pattern[0] == '*' && pattern[1] == '\0') ||
- stringmatchlen(pattern,plen,key,sdslen(key),0)) {
+ if (allkeys || stringmatchlen(pattern,plen,key,sdslen(key),0)) {
keyobj = createStringObject(key,sdslen(key));
if (expireIfNeeded(c->db,keyobj) == 0) {
addReplyBulk(c,keyobj);
@@ -243,17 +246,15 @@ void keysCommand(redisClient *c) {
}
}
dictReleaseIterator(di);
- lenobj->ptr = sdscatprintf(sdsempty(),"*%lu\r\n",numkeys);
+ setDeferredMultiBulkLength(c,replylen,numkeys);
}
void dbsizeCommand(redisClient *c) {
- addReplySds(c,
- sdscatprintf(sdsempty(),":%lu\r\n",dictSize(c->db->dict)));
+ addReplyLongLong(c,dictSize(c->db->dict));
}
void lastsaveCommand(redisClient *c) {
- addReplySds(c,
- sdscatprintf(sdsempty(),":%lu\r\n",server.lastsave));
+ addReplyLongLong(c,server.lastsave);
}
void typeCommand(redisClient *c) {
@@ -262,24 +263,23 @@ void typeCommand(redisClient *c) {
o = lookupKeyRead(c->db,c->argv[1]);
if (o == NULL) {
- type = "+none";
+ type = "none";
} else {
switch(o->type) {
- case REDIS_STRING: type = "+string"; break;
- case REDIS_LIST: type = "+list"; break;
- case REDIS_SET: type = "+set"; break;
- case REDIS_ZSET: type = "+zset"; break;
- case REDIS_HASH: type = "+hash"; break;
- default: type = "+unknown"; break;
+ case REDIS_STRING: type = "string"; break;
+ case REDIS_LIST: type = "list"; break;
+ case REDIS_SET: type = "set"; break;
+ case REDIS_ZSET: type = "zset"; break;
+ case REDIS_HASH: type = "hash"; break;
+ default: type = "unknown"; break;
}
}
- addReplySds(c,sdsnew(type));
- addReply(c,shared.crlf);
+ addReplyStatus(c,type);
}
void saveCommand(redisClient *c) {
if (server.bgsavechildpid != -1) {
- addReplySds(c,sdsnew("-ERR background save in progress\r\n"));
+ addReplyError(c,"Background save already in progress");
return;
}
if (rdbSave(server.dbfilename) == REDIS_OK) {
@@ -291,12 +291,11 @@ void saveCommand(redisClient *c) {
void bgsaveCommand(redisClient *c) {
if (server.bgsavechildpid != -1) {
- addReplySds(c,sdsnew("-ERR background save already in progress\r\n"));
+ addReplyError(c,"Background save already in progress");
return;
}
if (rdbSaveBackground(server.dbfilename) == REDIS_OK) {
- char *status = "+Background saving started\r\n";
- addReplySds(c,sdsnew(status));
+ addReplyStatus(c,"Background saving started");
} else {
addReply(c,shared.err);
}
@@ -305,7 +304,7 @@ void bgsaveCommand(redisClient *c) {
void shutdownCommand(redisClient *c) {
if (prepareForShutdown() == REDIS_OK)
exit(0);
- addReplySds(c, sdsnew("-ERR Errors trying to SHUTDOWN. Check logs.\r\n"));
+ addReplyError(c,"Errors trying to SHUTDOWN. Check logs.");
}
void renameGenericCommand(redisClient *c, int nx) {
@@ -321,7 +320,6 @@ void renameGenericCommand(redisClient *c, int nx) {
return;
incrRefCount(o);
- deleteIfVolatile(c->db,c->argv[2]);
if (dbAdd(c->db,c->argv[2],o) == REDIS_ERR) {
if (nx) {
decrRefCount(o);
@@ -375,7 +373,6 @@ void moveCommand(redisClient *c) {
}
/* Try to add the element to the target DB */
- deleteIfVolatile(dst,c->argv[1]);
if (dbAdd(dst,c->argv[1],o) == REDIS_ERR) {
addReply(c,shared.czero);
return;
@@ -396,23 +393,16 @@ int removeExpire(redisDb *db, robj *key) {
/* An expire may only be removed if there is a corresponding entry in the
* main dict. Otherwise, the key will never be freed. */
redisAssert(dictFind(db->dict,key->ptr) != NULL);
- if (dictDelete(db->expires,key->ptr) == DICT_OK) {
- return 1;
- } else {
- return 0;
- }
+ return dictDelete(db->expires,key->ptr) == DICT_OK;
}
-int setExpire(redisDb *db, robj *key, time_t when) {
+void setExpire(redisDb *db, robj *key, time_t when) {
dictEntry *de;
/* Reuse the sds from the main dict in the expire dict */
- redisAssert((de = dictFind(db->dict,key->ptr)) != NULL);
- if (dictAdd(db->expires,dictGetEntryKey(de),(void*)when) == DICT_ERR) {
- return 0;
- } else {
- return 1;
- }
+ de = dictFind(db->dict,key->ptr);
+ redisAssert(de != NULL);
+ dictReplace(db->expires,dictGetEntryKey(de),(void*)when);
}
/* Return the expire time of the specified key, or -1 if no expire
@@ -430,8 +420,46 @@ time_t getExpire(redisDb *db, robj *key) {
return (time_t) dictGetEntryVal(de);
}
+/* Propagate expires into slaves and the AOF file.
+ * When a key expires in the master, a DEL operation for this key is sent
+ * to all the slaves and the AOF file if enabled.
+ *
+ * This way the key expiry is centralized in one place, and since both
+ * AOF and the master->slave link guarantee operation ordering, everything
+ * will be consistent even if we allow write operations against expiring
+ * keys. */
+void propagateExpire(redisDb *db, robj *key) {
+ struct redisCommand *cmd;
+ robj *argv[2];
+
+ cmd = lookupCommand("del");
+ argv[0] = createStringObject("DEL",3);
+ argv[1] = key;
+ incrRefCount(key);
+
+ if (server.appendonly)
+ feedAppendOnlyFile(cmd,db->id,argv,2);
+ if (listLength(server.slaves))
+ replicationFeedSlaves(server.slaves,db->id,argv,2);
+
+ decrRefCount(argv[0]);
+ decrRefCount(argv[1]);
+}
+
int expireIfNeeded(redisDb *db, robj *key) {
time_t when = getExpire(db,key);
+
+ /* If we are running in the context of a slave, return ASAP:
+ * the slave key expiration is controlled by the master that will
+ * send us synthesized DEL operations for expired keys.
+ *
+ * Still we try to return the right information to the caller,
+ * that is, 0 if we think the key should be still valid, 1 if
+ * we think the key is expired at this time. */
+ if (server.masterhost != NULL) {
+ return time(NULL) > when;
+ }
+
if (when < 0) return 0;
/* Return when this key has not expired */
@@ -440,15 +468,7 @@ int expireIfNeeded(redisDb *db, robj *key) {
/* Delete the key */
server.stat_expiredkeys++;
server.dirty++;
- return dbDelete(db,key);
-}
-
-int deleteIfVolatile(redisDb *db, robj *key) {
- if (getExpire(db,key) < 0) return 0;
-
- /* Delete the key */
- server.stat_expiredkeys++;
- server.dirty++;
+ propagateExpire(db,key);
return dbDelete(db,key);
}
@@ -458,7 +478,7 @@ int deleteIfVolatile(redisDb *db, robj *key) {
void expireGenericCommand(redisClient *c, robj *key, robj *param, long offset) {
dictEntry *de;
- time_t seconds;
+ long seconds;
if (getLongFromObjectOrReply(c, param, &seconds, NULL) != REDIS_OK) return;
@@ -476,13 +496,10 @@ void expireGenericCommand(redisClient *c, robj *key, robj *param, long offset) {
return;
} else {
time_t when = time(NULL)+seconds;
- if (setExpire(c->db,key,when)) {
- addReply(c,shared.cone);
- touchWatchedKey(c->db,key);
- server.dirty++;
- } else {
- addReply(c,shared.czero);
- }
+ setExpire(c->db,key,when);
+ addReply(c,shared.cone);
+ touchWatchedKey(c->db,key);
+ server.dirty++;
return;
}
}
@@ -496,13 +513,28 @@ void expireatCommand(redisClient *c) {
}
void ttlCommand(redisClient *c) {
- time_t expire;
- int ttl = -1;
+ time_t expire, ttl = -1;
expire = getExpire(c->db,c->argv[1]);
if (expire != -1) {
- ttl = (int) (expire-time(NULL));
+ ttl = (expire-time(NULL));
if (ttl < 0) ttl = -1;
}
- addReplySds(c,sdscatprintf(sdsempty(),":%d\r\n",ttl));
+ addReplyLongLong(c,(long long)ttl);
+}
+
+void persistCommand(redisClient *c) {
+ dictEntry *de;
+
+ de = dictFind(c->db->dict,c->argv[1]->ptr);
+ if (de == NULL) {
+ addReply(c,shared.czero);
+ } else {
+ if (removeExpire(c->db,c->argv[1])) {
+ addReply(c,shared.cone);
+ server.dirty++;
+ } else {
+ addReply(c,shared.czero);
+ }
+ }
}
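The master/slave split described in the expireIfNeeded() comments above can be sketched in isolation as follows. This is a simplified model under stated assumptions, not the code from db.c: `key_is_expired` and `should_delete_expired_key` are hypothetical names, a negative `when` stands for "no expire set", and checking that case first is a choice made by this sketch.

```c
#include <time.h>

/* Hypothetical helper: report whether a key with expire time `when`
 * should be considered expired at time `now`.
 * A negative `when` means the key has no expire set. */
int key_is_expired(time_t when, time_t now) {
    if (when < 0) return 0;   /* no expire: never expired */
    return now > when;
}

/* Hypothetical helper: decide whether this instance may delete the key.
 * A slave only reports the expiry and waits for the DEL synthesized by
 * its master; a master deletes the key (and would propagate the DEL). */
int should_delete_expired_key(time_t when, time_t now, int is_slave) {
    return !is_slave && key_is_expired(when, now);
}
```

Keeping the "is it expired" question separate from the "may I delete it" question makes it easier to see that a slave can still answer the first while leaving the second to its master.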
diff --git a/src/debug.c b/src/debug.c
index ba183d72..2f7ab58f 100644
--- a/src/debug.c
+++ b/src/debug.c
@@ -119,16 +119,13 @@ void computeDatasetDigest(unsigned char *final) {
}
listTypeReleaseIterator(li);
} else if (o->type == REDIS_SET) {
- dict *set = o->ptr;
- dictIterator *di = dictGetIterator(set);
- dictEntry *de;
-
- while((de = dictNext(di)) != NULL) {
- robj *eleobj = dictGetEntryKey(de);
-
- xorObjectDigest(digest,eleobj);
+ setTypeIterator *si = setTypeInitIterator(o);
+ robj *ele;
+ while((ele = setTypeNext(si)) != NULL) {
+ xorObjectDigest(digest,ele);
+ decrRefCount(ele);
}
- dictReleaseIterator(di);
+ setTypeReleaseIterator(si);
} else if (o->type == REDIS_ZSET) {
zset *zs = o->ptr;
dictIterator *di = dictGetIterator(zs->dict);
@@ -214,18 +211,18 @@ void debugCommand(redisClient *c) {
char *strenc;
strenc = strEncoding(val->encoding);
- addReplySds(c,sdscatprintf(sdsempty(),
- "+Value at:%p refcount:%d "
- "encoding:%s serializedlength:%lld\r\n",
+ addReplyStatusFormat(c,
+ "Value at:%p refcount:%d "
+ "encoding:%s serializedlength:%lld",
(void*)val, val->refcount,
- strenc, (long long) rdbSavedObjectLen(val,NULL)));
+ strenc, (long long) rdbSavedObjectLen(val,NULL));
} else {
vmpointer *vp = (vmpointer*) val;
- addReplySds(c,sdscatprintf(sdsempty(),
- "+Value swapped at: page %llu "
- "using %llu pages\r\n",
+ addReplyStatusFormat(c,
+ "Value swapped at: page %llu "
+ "using %llu pages",
(unsigned long long) vp->page,
- (unsigned long long) vp->usedpages));
+ (unsigned long long) vp->usedpages);
}
} else if (!strcasecmp(c->argv[1]->ptr,"swapin") && c->argc == 3) {
lookupKeyRead(c->db,c->argv[2]);
@@ -236,7 +233,7 @@ void debugCommand(redisClient *c) {
vmpointer *vp;
if (!server.vm_enabled) {
- addReplySds(c,sdsnew("-ERR Virtual Memory is disabled\r\n"));
+ addReplyError(c,"Virtual Memory is disabled");
return;
}
if (!de) {
@@ -246,9 +243,9 @@ void debugCommand(redisClient *c) {
val = dictGetEntryVal(de);
/* Swap it */
if (val->storage != REDIS_VM_MEMORY) {
- addReplySds(c,sdsnew("-ERR This key is not in memory\r\n"));
+ addReplyError(c,"This key is not in memory");
} else if (val->refcount != 1) {
- addReplySds(c,sdsnew("-ERR Object is shared\r\n"));
+ addReplyError(c,"Object is shared");
} else if ((vp = vmSwapObjectBlocking(val)) != NULL) {
dictGetEntryVal(de) = vp;
addReply(c,shared.ok);
@@ -277,18 +274,17 @@ void debugCommand(redisClient *c) {
addReply(c,shared.ok);
} else if (!strcasecmp(c->argv[1]->ptr,"digest") && c->argc == 2) {
unsigned char digest[20];
- sds d = sdsnew("+");
+ sds d = sdsempty();
int j;
computeDatasetDigest(digest);
for (j = 0; j < 20; j++)
d = sdscatprintf(d, "%02x",digest[j]);
-
- d = sdscatlen(d,"\r\n",2);
- addReplySds(c,d);
+ addReplyStatus(c,d);
+ sdsfree(d);
} else {
- addReplySds(c,sdsnew(
- "-ERR Syntax error, try DEBUG [SEGFAULT|OBJECT <key>|SWAPIN <key>|SWAPOUT <key>|RELOAD]\r\n"));
+ addReplyError(c,
+ "Syntax error, try DEBUG [SEGFAULT|OBJECT <key>|SWAPIN <key>|SWAPOUT <key>|RELOAD]");
}
}
diff --git a/src/dict.c b/src/dict.c
index 2d1e752b..a1060d45 100644
--- a/src/dict.c
+++ b/src/dict.c
@@ -49,8 +49,13 @@
/* Using dictEnableResize() / dictDisableResize() we make possible to
* enable/disable resizing of the hash table as needed. This is very important
* for Redis, as we use copy-on-write and don't want to move too much memory
- * around when there is a child performing saving operations. */
+ * around when there is a child performing saving operations.
+ *
+ * Note that even when dict_can_resize is set to 0, not all resizes are
+ * prevented: a hash table is still allowed to grow if the ratio between
+ * the number of elements and the buckets > dict_force_resize_ratio. */
static int dict_can_resize = 1;
+static unsigned int dict_force_resize_ratio = 5;
/* -------------------------- private prototypes ---------------------------- */
@@ -125,7 +130,7 @@ int _dictInit(dict *d, dictType *type,
}
/* Resize the table to the minimal size that contains all the elements,
- * but with the invariant of a USER/BUCKETS ration near to <= 1 */
+ * but with the invariant of a USER/BUCKETS ratio near to <= 1 */
int dictResize(dict *d)
{
int minimal;
@@ -493,14 +498,23 @@ dictEntry *dictGetRandomKey(dict *d)
/* Expand the hash table if needed */
static int _dictExpandIfNeeded(dict *d)
{
- /* If the hash table is empty expand it to the intial size,
- * if the table is "full" dobule its size. */
+ /* Incremental rehashing already in progress. Return. */
if (dictIsRehashing(d)) return DICT_OK;
- if (d->ht[0].size == 0)
- return dictExpand(d, DICT_HT_INITIAL_SIZE);
- if (d->ht[0].used >= d->ht[0].size && dict_can_resize)
+
+ /* If the hash table is empty expand it to the initial size. */
+ if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);
+
+ /* If we reached the 1:1 ratio, and we are allowed to resize the hash
+ * table (global setting) or we should avoid it but the ratio between
+ * elements/buckets is over the "safe" threshold, we resize doubling
+ * the number of buckets. */
+ if (d->ht[0].used >= d->ht[0].size &&
+ (dict_can_resize ||
+ d->ht[0].used/d->ht[0].size > dict_force_resize_ratio))
+ {
return dictExpand(d, ((d->ht[0].size > d->ht[0].used) ?
d->ht[0].size : d->ht[0].used)*2);
+ }
return DICT_OK;
}
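The expansion rule in the _dictExpandIfNeeded hunk above can be restated as a pure function. This is a minimal sketch, not the patch's code: next_table_size and its parameters are hypothetical names, initial_size stands in for DICT_HT_INITIAL_SIZE, force_ratio for dict_force_resize_ratio, and the return value is only the size that would be requested (whatever rounding dictExpand applies is not modeled).

```c
#include <assert.h>

/* Returns the bucket count to request, or 0 when no expansion is needed.
 * All names here are illustrative and do not exist in dict.c. */
unsigned long next_table_size(unsigned long used, unsigned long size,
                              unsigned long initial_size,
                              int can_resize, unsigned long force_ratio)
{
    assert(initial_size > 0);

    /* An empty table always gets its initial size. */
    if (size == 0) return initial_size;

    /* At or past a 1:1 load, grow when resizing is allowed, or when it is
     * discouraged but the elements/buckets ratio exceeds the threshold. */
    if (used >= size && (can_resize || used / size > force_ratio))
        return (size > used ? size : used) * 2;

    return 0;
}
```

Because the ratio is computed with integer division, a table with 4 buckets and a threshold of 5 is only forced to grow once it holds 24 elements, not 21.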
diff --git a/src/intset.c b/src/intset.c
new file mode 100644
index 00000000..2f359b7f
--- /dev/null
+++ b/src/intset.c
@@ -0,0 +1,422 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include "intset.h"
+#include "zmalloc.h"
+
+/* Note that these encodings are ordered, so:
+ * INTSET_ENC_INT16 < INTSET_ENC_INT32 < INTSET_ENC_INT64. */
+#define INTSET_ENC_INT16 (sizeof(int16_t))
+#define INTSET_ENC_INT32 (sizeof(int32_t))
+#define INTSET_ENC_INT64 (sizeof(int64_t))
+
+/* Return the required encoding for the provided value. */
+static uint8_t _intsetValueEncoding(int64_t v) {
+ if (v < INT32_MIN || v > INT32_MAX)
+ return INTSET_ENC_INT64;
+ else if (v < INT16_MIN || v > INT16_MAX)
+ return INTSET_ENC_INT32;
+ return INTSET_ENC_INT16;
+}
+
+/* Return the value at pos, given an encoding. */
+static int64_t _intsetGetEncoded(intset *is, int pos, uint8_t enc) {
+ if (enc == INTSET_ENC_INT64)
+ return ((int64_t*)is->contents)[pos];
+ else if (enc == INTSET_ENC_INT32)
+ return ((int32_t*)is->contents)[pos];
+ return ((int16_t*)is->contents)[pos];
+}
+
+/* Return the value at pos, using the configured encoding. */
+static int64_t _intsetGet(intset *is, int pos) {
+ return _intsetGetEncoded(is,pos,is->encoding);
+}
+
+/* Set the value at pos, using the configured encoding. */
+static void _intsetSet(intset *is, int pos, int64_t value) {
+ if (is->encoding == INTSET_ENC_INT64)
+ ((int64_t*)is->contents)[pos] = value;
+ else if (is->encoding == INTSET_ENC_INT32)
+ ((int32_t*)is->contents)[pos] = value;
+ else
+ ((int16_t*)is->contents)[pos] = value;
+}
+
+/* Create an empty intset. */
+intset *intsetNew(void) {
+ intset *is = zmalloc(sizeof(intset));
+ is->encoding = INTSET_ENC_INT16;
+ is->length = 0;
+ return is;
+}
+
+/* Resize the intset */
+static intset *intsetResize(intset *is, uint32_t len) {
+ uint32_t size = len*is->encoding;
+ is = zrealloc(is,sizeof(intset)+size);
+ return is;
+}
+
+/* Search for the position of "value". Return 1 when the value was found and
+ * sets "pos" to the position of the value within the intset. Return 0 when
+ * the value is not present in the intset and sets "pos" to the position
+ * where "value" can be inserted. */
+static uint8_t intsetSearch(intset *is, int64_t value, uint32_t *pos) {
+ int min = 0, max = is->length-1, mid = -1;
+ int64_t cur = -1;
+
+ /* The value can never be found when the set is empty */
+ if (is->length == 0) {
+ if (pos) *pos = 0;
+ return 0;
+ } else {
+ /* Check for the case where we know we cannot find the value,
+ * but do know the insert position. */
+ if (value > _intsetGet(is,is->length-1)) {
+ if (pos) *pos = is->length;
+ return 0;
+ } else if (value < _intsetGet(is,0)) {
+ if (pos) *pos = 0;
+ return 0;
+ }
+ }
+
+ while(max >= min) {
+ mid = (min+max)/2;
+ cur = _intsetGet(is,mid);
+ if (value > cur) {
+ min = mid+1;
+ } else if (value < cur) {
+ max = mid-1;
+ } else {
+ break;
+ }
+ }
+
+ if (value == cur) {
+ if (pos) *pos = mid;
+ return 1;
+ } else {
+ if (pos) *pos = min;
+ return 0;
+ }
+}
+
+/* Upgrades the intset to a larger encoding and inserts the given integer. */
+static intset *intsetUpgradeAndAdd(intset *is, int64_t value) {
+ uint8_t curenc = is->encoding;
+ uint8_t newenc = _intsetValueEncoding(value);
+ int length = is->length;
+ int prepend = value < 0 ? 1 : 0;
+
+ /* First set new encoding and resize */
+ is->encoding = newenc;
+ is = intsetResize(is,is->length+1);
+
+ /* Upgrade back-to-front so we don't overwrite values.
+ * Note that the "prepend" variable is used to make sure we have an empty
+ * space at either the beginning or the end of the intset. */
+ while(length--)
+ _intsetSet(is,length+prepend,_intsetGetEncoded(is,length,curenc));
+
+ /* Set the value at the beginning or the end. */
+ if (prepend)
+ _intsetSet(is,0,value);
+ else
+ _intsetSet(is,is->length,value);
+ is->length++;
+ return is;
+}
+
+static void intsetMoveTail(intset *is, uint32_t from, uint32_t to) {
+ void *src, *dst;
+ uint32_t bytes = is->length-from;
+ if (is->encoding == INTSET_ENC_INT64) {
+ src = (int64_t*)is->contents+from;
+ dst = (int64_t*)is->contents+to;
+ bytes *= sizeof(int64_t);
+ } else if (is->encoding == INTSET_ENC_INT32) {
+ src = (int32_t*)is->contents+from;
+ dst = (int32_t*)is->contents+to;
+ bytes *= sizeof(int32_t);
+ } else {
+ src = (int16_t*)is->contents+from;
+ dst = (int16_t*)is->contents+to;
+ bytes *= sizeof(int16_t);
+ }
+ memmove(dst,src,bytes);
+}
+
+/* Insert an integer in the intset */
+intset *intsetAdd(intset *is, int64_t value, uint8_t *success) {
+ uint8_t valenc = _intsetValueEncoding(value);
+ uint32_t pos;
+ if (success) *success = 1;
+
+ /* Upgrade encoding if necessary. If we need to upgrade, we know that
+ * this value should be either appended (if > 0) or prepended (if < 0),
+ * because it lies outside the range of existing values. */
+ if (valenc > is->encoding) {
+ /* This always succeeds, so we don't need to curry *success. */
+ return intsetUpgradeAndAdd(is,value);
+ } else {
+ /* Abort if the value is already present in the set.
+ * This call will populate "pos" with the right position to insert
+ * the value when it cannot be found. */
+ if (intsetSearch(is,value,&pos)) {
+ if (success) *success = 0;
+ return is;
+ }
+
+ is = intsetResize(is,is->length+1);
+ if (pos < is->length) intsetMoveTail(is,pos,pos+1);
+ }
+
+ _intsetSet(is,pos,value);
+ is->length++;
+ return is;
+}
+
+/* Delete integer from intset */
+intset *intsetRemove(intset *is, int64_t value, uint8_t *success) {
+ uint8_t valenc = _intsetValueEncoding(value);
+ uint32_t pos;
+ if (success) *success = 0;
+
+ if (valenc <= is->encoding && intsetSearch(is,value,&pos)) {
+ /* We know we can delete */
+ if (success) *success = 1;
+
+ /* Overwrite value with tail and update length */
+ if (pos < (is->length-1)) intsetMoveTail(is,pos+1,pos);
+ is = intsetResize(is,is->length-1);
+ is->length--;
+ }
+ return is;
+}
+
+/* Determine whether a value belongs to this set */
+uint8_t intsetFind(intset *is, int64_t value) {
+ uint8_t valenc = _intsetValueEncoding(value);
+ return valenc <= is->encoding && intsetSearch(is,value,NULL);
+}
+
+/* Return random member */
+int64_t intsetRandom(intset *is) {
+ return _intsetGet(is,rand()%is->length);
+}
+
+/* Sets the value to the value at the given position. When this position is
+ * out of range the function returns 0, when in range it returns 1. */
+uint8_t intsetGet(intset *is, uint32_t pos, int64_t *value) {
+ if (pos < is->length) {
+ *value = _intsetGet(is,pos);
+ return 1;
+ }
+ return 0;
+}
+
+/* Return intset length */
+uint32_t intsetLen(intset *is) {
+ return is->length;
+}
+
+#ifdef INTSET_TEST_MAIN
+#include <sys/time.h>
+
+void intsetRepr(intset *is) {
+ int i;
+ for (i = 0; i < is->length; i++) {
+ printf("%lld\n", (long long)_intsetGet(is,i));
+ }
+ printf("\n");
+}
+
+void error(char *err) {
+ printf("%s\n", err);
+ exit(1);
+}
+
+void ok(void) {
+ printf("OK\n");
+}
+
+long long usec(void) {
+ struct timeval tv;
+ gettimeofday(&tv,NULL);
+ return (((long long)tv.tv_sec)*1000000)+tv.tv_usec;
+}
+
+#define assert(_e) ((_e)?(void)0:(_assert(#_e,__FILE__,__LINE__),exit(1)))
+void _assert(char *estr, char *file, int line) {
+ printf("\n\n=== ASSERTION FAILED ===\n");
+ printf("==> %s:%d '%s' is not true\n",file,line,estr);
+}
+
+intset *createSet(int bits, int size) {
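+ uint64_t mask = (1<<bits)-1;
+ uint64_t i, value;
+ intset *is = intsetNew();
+
+ for (i = 0; i < size; i++) {
+ if (bits > 32) {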
+ value = (rand()*rand()) & mask;
+ } else {
+ value = rand() & mask;
+ }
+ is = intsetAdd(is,value,NULL);
+ }
+ return is;
+}
+
+void checkConsistency(intset *is) {
+ int i;
+
+ for (i = 0; i < (is->length-1); i++) {
+ if (is->encoding == INTSET_ENC_INT16) {
+ int16_t *i16 = (int16_t*)is->contents;
+ assert(i16[i] < i16[i+1]);
+ } else if (is->encoding == INTSET_ENC_INT32) {
+ int32_t *i32 = (int32_t*)is->contents;
+ assert(i32[i] < i32[i+1]);
+ } else {
+ int64_t *i64 = (int64_t*)is->contents;
+ assert(i64[i] < i64[i+1]);
+ }
+ }
+}
+
+int main(int argc, char **argv) {
+ uint8_t success;
+ int i;
+ intset *is;
+ sranddev();
+
+ printf("Value encodings: "); {
+ assert(_intsetValueEncoding(-32768) == INTSET_ENC_INT16);
+ assert(_intsetValueEncoding(+32767) == INTSET_ENC_INT16);
+ assert(_intsetValueEncoding(-32769) == INTSET_ENC_INT32);
+ assert(_intsetValueEncoding(+32768) == INTSET_ENC_INT32);
+ assert(_intsetValueEncoding(-2147483648) == INTSET_ENC_INT32);
+ assert(_intsetValueEncoding(+2147483647) == INTSET_ENC_INT32);
+ assert(_intsetValueEncoding(-2147483649) == INTSET_ENC_INT64);
+ assert(_intsetValueEncoding(+2147483648) == INTSET_ENC_INT64);
+ assert(_intsetValueEncoding(-9223372036854775808ull) == INTSET_ENC_INT64);
+ assert(_intsetValueEncoding(+9223372036854775807ull) == INTSET_ENC_INT64);
+ ok();
+ }
+
+ printf("Basic adding: "); {
+ is = intsetNew();
+ is = intsetAdd(is,5,&success); assert(success);
+ is = intsetAdd(is,6,&success); assert(success);
+ is = intsetAdd(is,4,&success); assert(success);
+ is = intsetAdd(is,4,&success); assert(!success);
+ ok();
+ }
+
+ printf("Large number of random adds: "); {
+ int inserts = 0;
+ is = intsetNew();
+ for (i = 0; i < 1024; i++) {
+ is = intsetAdd(is,rand()%0x800,&success);
+ if (success) inserts++;
+ }
+ assert(is->length == inserts);
+ checkConsistency(is);
+ ok();
+ }
+
+ printf("Upgrade from int16 to int32: "); {
+ is = intsetNew();
+ is = intsetAdd(is,32,NULL);
+ assert(is->encoding == INTSET_ENC_INT16);
+ is = intsetAdd(is,65535,NULL);
+ assert(is->encoding == INTSET_ENC_INT32);
+ assert(intsetFind(is,32));
+ assert(intsetFind(is,65535));
+ checkConsistency(is);
+
+ is = intsetNew();
+ is = intsetAdd(is,32,NULL);
+ assert(is->encoding == INTSET_ENC_INT16);
+ is = intsetAdd(is,-65535,NULL);
+ assert(is->encoding == INTSET_ENC_INT32);
+ assert(intsetFind(is,32));
+ assert(intsetFind(is,-65535));
+ checkConsistency(is);
+ ok();
+ }
+
+ printf("Upgrade from int16 to int64: "); {
+ is = intsetNew();
+ is = intsetAdd(is,32,NULL);
+ assert(is->encoding == INTSET_ENC_INT16);
+ is = intsetAdd(is,4294967295,NULL);
+ assert(is->encoding == INTSET_ENC_INT64);
+ assert(intsetFind(is,32));
+ assert(intsetFind(is,4294967295));
+ checkConsistency(is);
+
+ is = intsetNew();
+ is = intsetAdd(is,32,NULL);
+ assert(is->encoding == INTSET_ENC_INT16);
+ is = intsetAdd(is,-4294967295,NULL);
+ assert(is->encoding == INTSET_ENC_INT64);
+ assert(intsetFind(is,32));
+ assert(intsetFind(is,-4294967295));
+ checkConsistency(is);
+ ok();
+ }
+
+ printf("Upgrade from int32 to int64: "); {
+ is = intsetNew();
+ is = intsetAdd(is,65535,NULL);
+ assert(is->encoding == INTSET_ENC_INT32);
+ is = intsetAdd(is,4294967295,NULL);
+ assert(is->encoding == INTSET_ENC_INT64);
+ assert(intsetFind(is,65535));
+ assert(intsetFind(is,4294967295));
+ checkConsistency(is);
+
+ is = intsetNew();
+ is = intsetAdd(is,65535,NULL);
+ assert(is->encoding == INTSET_ENC_INT32);
+ is = intsetAdd(is,-4294967295,NULL);
+ assert(is->encoding == INTSET_ENC_INT64);
+ assert(intsetFind(is,65535));
+ assert(intsetFind(is,-4294967295));
+ checkConsistency(is);
+ ok();
+ }
+
+ printf("Stress lookups: "); {
+ long num = 100000, size = 10000;
+ int i, bits = 20;
+ long long start;
+ is = createSet(bits,size);
+ checkConsistency(is);
+
+ start = usec();
+ for (i = 0; i < num; i++) intsetSearch(is,rand() % ((1<
+
+typedef struct intset {
+ uint32_t encoding;
+ uint32_t length;
+ int8_t contents[];
+} intset;
+
+intset *intsetNew(void);
+intset *intsetAdd(intset *is, int64_t value, uint8_t *success);
+intset *intsetRemove(intset *is, int64_t value, uint8_t *success);
+uint8_t intsetFind(intset *is, int64_t value);
+int64_t intsetRandom(intset *is);
+uint8_t intsetGet(intset *is, uint32_t pos, int64_t *value);
+uint32_t intsetLen(intset *is);
+
+#endif // __INTSET_H
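The two ideas the intset relies on, picking the narrowest integer width for a value and binary-searching a sorted array for either a hit or an insertion point, can be sketched in isolation. This is an illustration under assumptions, not the patch's implementation: value_width and sorted_search are hypothetical names, the array is a plain int64_t array rather than the packed contents[] buffer, and the search uses half-open bounds instead of the inclusive min/max loop in intsetSearch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest element width in bytes (2, 4 or 8) able to hold v.
 * Mirrors the ordering INT16 < INT32 < INT64 used by the encodings. */
uint8_t value_width(int64_t v) {
    if (v < INT32_MIN || v > INT32_MAX) return (uint8_t)sizeof(int64_t);
    if (v < INT16_MIN || v > INT16_MAX) return (uint8_t)sizeof(int32_t);
    return (uint8_t)sizeof(int16_t);
}

/* Binary search over a sorted array of distinct values.
 * Returns 1 and stores the index in *pos when value is present;
 * returns 0 and stores the insertion index in *pos otherwise. */
int sorted_search(const int64_t *a, uint32_t len, int64_t value,
                  uint32_t *pos) {
    uint32_t lo = 0, hi = len;

    assert(a != NULL || len == 0);
    assert(pos != NULL);

    while (lo < hi) {
        uint32_t mid = lo + (hi - lo) / 2;
        if (a[mid] < value)
            lo = mid + 1;   /* value lies strictly to the right of mid */
        else
            hi = mid;       /* value is at mid or to its left */
    }
    *pos = lo;
    return lo < len && a[lo] == value;
}
```

A value whose width exceeds the set's current encoding cannot already be present, which is why intsetFind and intsetRemove compare encodings before searching.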
diff --git a/src/multi.c b/src/multi.c
index def1dd67..47615eb0 100644
--- a/src/multi.c
+++ b/src/multi.c
@@ -42,7 +42,7 @@ void queueMultiCommand(redisClient *c, struct redisCommand *cmd) {
void multiCommand(redisClient *c) {
if (c->flags & REDIS_MULTI) {
- addReplySds(c,sdsnew("-ERR MULTI calls can not be nested\r\n"));
+ addReplyError(c,"MULTI calls can not be nested");
return;
}
c->flags |= REDIS_MULTI;
@@ -51,7 +51,7 @@ void multiCommand(redisClient *c) {
void discardCommand(redisClient *c) {
if (!(c->flags & REDIS_MULTI)) {
- addReplySds(c,sdsnew("-ERR DISCARD without MULTI\r\n"));
+ addReplyError(c,"DISCARD without MULTI");
return;
}
@@ -82,7 +82,7 @@ void execCommand(redisClient *c) {
int orig_argc;
if (!(c->flags & REDIS_MULTI)) {
- addReplySds(c,sdsnew("-ERR EXEC without MULTI\r\n"));
+ addReplyError(c,"EXEC without MULTI");
return;
}
@@ -107,7 +107,7 @@ void execCommand(redisClient *c) {
unwatchAllKeys(c); /* Unwatch ASAP otherwise we'll waste CPU cycles */
orig_argv = c->argv;
orig_argc = c->argc;
- addReplySds(c,sdscatprintf(sdsempty(),"*%d\r\n",c->mstate.count));
+ addReplyMultiBulkLen(c,c->mstate.count);
for (j = 0; j < c->mstate.count; j++) {
c->argc = c->mstate.commands[j].argc;
c->argv = c->mstate.commands[j].argv;
@@ -251,7 +251,7 @@ void watchCommand(redisClient *c) {
int j;
if (c->flags & REDIS_MULTI) {
- addReplySds(c,sdsnew("-ERR WATCH inside MULTI is not allowed\r\n"));
+ addReplyError(c,"WATCH inside MULTI is not allowed");
return;
}
for (j = 1; j < c->argc; j++)
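The call sites above replace hand-built "-ERR ...\r\n" strings with addReplyError, so the protocol framing lives in one place. A rough sketch of the bytes that end up on the wire, assuming a caller-supplied buffer: format_error_reply is a hypothetical helper written for this note, not a Redis function, and it copies the three pieces in the same order as the prefix, message and terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes "-ERR <msg>\r\n" plus a terminating NUL into dst.
 * Returns the reply length without the NUL, or -1 if dst is too small.
 * Illustrative only; the real code appends to the client reply buffer. */
long format_error_reply(char *dst, size_t cap, const char *msg) {
    size_t mlen, need;

    assert(dst != NULL && msg != NULL);

    mlen = strlen(msg);
    need = 5 + mlen + 2;              /* "-ERR " + msg + "\r\n" */
    if (need + 1 > cap) return -1;    /* +1 for the NUL */

    memcpy(dst, "-ERR ", 5);
    memcpy(dst + 5, msg, mlen);
    memcpy(dst + 5 + mlen, "\r\n", 2);
    dst[need] = '\0';
    return (long)need;
}
```

Keeping the prefix and terminator out of the message literals means a caller can no longer forget the trailing "\r\n" or mistype the "-ERR " prefix.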
diff --git a/src/networking.c b/src/networking.c
index 4e93186e..d1c6a75a 100644
--- a/src/networking.c
+++ b/src/networking.c
@@ -1,5 +1,4 @@
#include "redis.h"
-
#include
void *dupClientReplyValue(void *o) {
@@ -12,14 +11,24 @@ int listMatchObjects(void *a, void *b) {
}
redisClient *createClient(int fd) {
- redisClient *c = zmalloc(sizeof(*c));
+ redisClient *c = zmalloc(sizeof(redisClient));
+ c->bufpos = 0;
anetNonBlock(NULL,fd);
anetTcpNoDelay(NULL,fd);
if (!c) return NULL;
+ if (aeCreateFileEvent(server.el,fd,AE_READABLE,
+ readQueryFromClient, c) == AE_ERR)
+ {
+ close(fd);
+ zfree(c);
+ return NULL;
+ }
+
selectDb(c,0);
c->fd = fd;
c->querybuf = sdsempty();
+ c->newline = NULL;
c->argc = 0;
c->argv = NULL;
c->bulklen = -1;
@@ -43,80 +52,254 @@ redisClient *createClient(int fd) {
c->pubsub_patterns = listCreate();
listSetFreeMethod(c->pubsub_patterns,decrRefCount);
listSetMatchMethod(c->pubsub_patterns,listMatchObjects);
- if (aeCreateFileEvent(server.el, c->fd, AE_READABLE,
- readQueryFromClient, c) == AE_ERR) {
- freeClient(c);
- return NULL;
- }
listAddNodeTail(server.clients,c);
initClientMultiState(c);
return c;
}
-void addReply(redisClient *c, robj *obj) {
- if (listLength(c->reply) == 0 &&
+int _installWriteEvent(redisClient *c) {
+ if (c->fd <= 0) return REDIS_ERR;
+ if (c->bufpos == 0 && listLength(c->reply) == 0 &&
(c->replstate == REDIS_REPL_NONE ||
c->replstate == REDIS_REPL_ONLINE) &&
aeCreateFileEvent(server.el, c->fd, AE_WRITABLE,
- sendReplyToClient, c) == AE_ERR) return;
+ sendReplyToClient, c) == AE_ERR) return REDIS_ERR;
+ return REDIS_OK;
+}
- if (server.vm_enabled && obj->storage != REDIS_VM_MEMORY) {
- obj = dupStringObject(obj);
- obj->refcount = 0; /* getDecodedObject() will increment the refcount */
+/* Create a duplicate of the last object in the reply list when
+ * it is not exclusively owned by the reply list. */
+robj *dupLastObjectIfNeeded(list *reply) {
+ robj *new, *cur;
+ listNode *ln;
+ redisAssert(listLength(reply) > 0);
+ ln = listLast(reply);
+ cur = listNodeValue(ln);
+ if (cur->refcount > 1) {
+ new = dupStringObject(cur);
+ decrRefCount(cur);
+ listNodeValue(ln) = new;
+ }
+ return listNodeValue(ln);
+}
+
+int _addReplyToBuffer(redisClient *c, char *s, size_t len) {
+ size_t available = sizeof(c->buf)-c->bufpos;
+
+ /* If there already are entries in the reply list, we cannot
+ * add anything more to the static buffer. */
+ if (listLength(c->reply) > 0) return REDIS_ERR;
+
+ /* Check that the buffer has enough space available for this string. */
+ if (len > available) return REDIS_ERR;
+
+ memcpy(c->buf+c->bufpos,s,len);
+ c->bufpos+=len;
+ return REDIS_OK;
+}
+
+void _addReplyObjectToList(redisClient *c, robj *o) {
+ robj *tail;
+ if (listLength(c->reply) == 0) {
+ incrRefCount(o);
+ listAddNodeTail(c->reply,o);
+ } else {
+ tail = listNodeValue(listLast(c->reply));
+
+ /* Append to this object when possible. */
+ if (tail->ptr != NULL &&
+ sdslen(tail->ptr)+sdslen(o->ptr) <= REDIS_REPLY_CHUNK_BYTES)
+ {
+ tail = dupLastObjectIfNeeded(c->reply);
+ tail->ptr = sdscatlen(tail->ptr,o->ptr,sdslen(o->ptr));
+ } else {
+ incrRefCount(o);
+ listAddNodeTail(c->reply,o);
+ }
+ }
+}
+
+/* This method takes responsibility over the sds. When it is no longer
+ * needed it will be free'd, otherwise it ends up in a robj. */
+void _addReplySdsToList(redisClient *c, sds s) {
+ robj *tail;
+ if (listLength(c->reply) == 0) {
+ listAddNodeTail(c->reply,createObject(REDIS_STRING,s));
+ } else {
+ tail = listNodeValue(listLast(c->reply));
+
+ /* Append to this object when possible. */
+ if (tail->ptr != NULL &&
+ sdslen(tail->ptr)+sdslen(s) <= REDIS_REPLY_CHUNK_BYTES)
+ {
+ tail = dupLastObjectIfNeeded(c->reply);
+ tail->ptr = sdscatlen(tail->ptr,s,sdslen(s));
+ sdsfree(s);
+ } else {
+ listAddNodeTail(c->reply,createObject(REDIS_STRING,s));
+ }
+ }
+}
+
+void _addReplyStringToList(redisClient *c, char *s, size_t len) {
+ robj *tail;
+ if (listLength(c->reply) == 0) {
+ listAddNodeTail(c->reply,createStringObject(s,len));
+ } else {
+ tail = listNodeValue(listLast(c->reply));
+
+ /* Append to this object when possible. */
+ if (tail->ptr != NULL &&
+ sdslen(tail->ptr)+len <= REDIS_REPLY_CHUNK_BYTES)
+ {
+ tail = dupLastObjectIfNeeded(c->reply);
+ tail->ptr = sdscatlen(tail->ptr,s,len);
+ } else {
+ listAddNodeTail(c->reply,createStringObject(s,len));
+ }
+ }
+}
+
+void addReply(redisClient *c, robj *obj) {
+ if (_installWriteEvent(c) != REDIS_OK) return;
+ redisAssert(!server.vm_enabled || obj->storage == REDIS_VM_MEMORY);
+
+ /* This is an important place where we can avoid copy-on-write
+ * when there is a saving child running, avoiding touching the
+ * refcount field of the object if it's not needed.
+ *
+ * If the encoding is RAW and there is room in the static buffer
+ * we'll be able to send the object to the client without
+ * messing with its page. */
+ if (obj->encoding == REDIS_ENCODING_RAW) {
+ if (_addReplyToBuffer(c,obj->ptr,sdslen(obj->ptr)) != REDIS_OK)
+ _addReplyObjectToList(c,obj);
+ } else {
+ obj = getDecodedObject(obj);
+ if (_addReplyToBuffer(c,obj->ptr,sdslen(obj->ptr)) != REDIS_OK)
+ _addReplyObjectToList(c,obj);
+ decrRefCount(obj);
}
- listAddNodeTail(c->reply,getDecodedObject(obj));
}
void addReplySds(redisClient *c, sds s) {
- robj *o = createObject(REDIS_STRING,s);
- addReply(c,o);
- decrRefCount(o);
+ if (_installWriteEvent(c) != REDIS_OK) {
+ /* The caller expects the sds to be free'd. */
+ sdsfree(s);
+ return;
+ }
+ if (_addReplyToBuffer(c,s,sdslen(s)) == REDIS_OK) {
+ sdsfree(s);
+ } else {
+ /* This method free's the sds when it is no longer needed. */
+ _addReplySdsToList(c,s);
+ }
+}
+
+void addReplyString(redisClient *c, char *s, size_t len) {
+ if (_installWriteEvent(c) != REDIS_OK) return;
+ if (_addReplyToBuffer(c,s,len) != REDIS_OK)
+ _addReplyStringToList(c,s,len);
+}
+
+void _addReplyError(redisClient *c, char *s, size_t len) {
+ addReplyString(c,"-ERR ",5);
+ addReplyString(c,s,len);
+ addReplyString(c,"\r\n",2);
+}
+
+void addReplyError(redisClient *c, char *err) {
+ _addReplyError(c,err,strlen(err));
+}
+
+void addReplyErrorFormat(redisClient *c, const char *fmt, ...) {
+ va_list ap;
+ va_start(ap,fmt);
+ sds s = sdscatvprintf(sdsempty(),fmt,ap);
+ va_end(ap);
+ _addReplyError(c,s,sdslen(s));
+ sdsfree(s);
+}
+
+void _addReplyStatus(redisClient *c, char *s, size_t len) {
+ addReplyString(c,"+",1);
+ addReplyString(c,s,len);
+ addReplyString(c,"\r\n",2);
+}
+
+void addReplyStatus(redisClient *c, char *status) {
+ _addReplyStatus(c,status,strlen(status));
+}
+
+void addReplyStatusFormat(redisClient *c, const char *fmt, ...) {
+ va_list ap;
+ va_start(ap,fmt);
+ sds s = sdscatvprintf(sdsempty(),fmt,ap);
+ va_end(ap);
+ _addReplyStatus(c,s,sdslen(s));
+ sdsfree(s);
+}
+
+/* Adds an empty object to the reply list that will contain the multi bulk
+ * length, which is not known when this function is called. */
+void *addDeferredMultiBulkLength(redisClient *c) {
+ /* Note that we install the write event here even if the object is not
+ * ready to be sent, since we are sure that before returning to the
+ * event loop setDeferredMultiBulkLength() will be called. */
+ if (_installWriteEvent(c) != REDIS_OK) return NULL;
+ listAddNodeTail(c->reply,createObject(REDIS_STRING,NULL));
+ return listLast(c->reply);
+}
+
+/* Populate the length object and try glueing it to the next chunk. */
+void setDeferredMultiBulkLength(redisClient *c, void *node, long length) {
+ listNode *ln = (listNode*)node;
+ robj *len, *next;
+
+ /* Abort when *node is NULL (see addDeferredMultiBulkLength). */
+ if (node == NULL) return;
+
+ len = listNodeValue(ln);
+ len->ptr = sdscatprintf(sdsempty(),"*%ld\r\n",length);
+ if (ln->next != NULL) {
+ next = listNodeValue(ln->next);
+
+ /* Only glue when the next node is non-NULL (an sds in this case) */
+ if (next->ptr != NULL) {
+ len->ptr = sdscatlen(len->ptr,next->ptr,sdslen(next->ptr));
+ listDelNode(c->reply,ln->next);
+ }
+ }
}
void addReplyDouble(redisClient *c, double d) {
- char buf[128];
-
- snprintf(buf,sizeof(buf),"%.17g",d);
- addReplySds(c,sdscatprintf(sdsempty(),"$%lu\r\n%s\r\n",
- (unsigned long) strlen(buf),buf));
+ char dbuf[128], sbuf[128];
+ int dlen, slen;
+ dlen = snprintf(dbuf,sizeof(dbuf),"%.17g",d);
+ slen = snprintf(sbuf,sizeof(sbuf),"$%d\r\n%s\r\n",dlen,dbuf);
+ addReplyString(c,sbuf,slen);
}
-void addReplyLongLong(redisClient *c, long long ll) {
+void _addReplyLongLong(redisClient *c, long long ll, char prefix) {
char buf[128];
- size_t len;
-
- if (ll == 0) {
- addReply(c,shared.czero);
- return;
- } else if (ll == 1) {
- addReply(c,shared.cone);
- return;
- }
- buf[0] = ':';
+ int len;
+ buf[0] = prefix;
len = ll2string(buf+1,sizeof(buf)-1,ll);
buf[len+1] = '\r';
buf[len+2] = '\n';
- addReplySds(c,sdsnewlen(buf,len+3));
+ addReplyString(c,buf,len+3);
}
-void addReplyUlong(redisClient *c, unsigned long ul) {
- char buf[128];
- size_t len;
+void addReplyLongLong(redisClient *c, long long ll) {
+ _addReplyLongLong(c,ll,':');
+}
- if (ul == 0) {
- addReply(c,shared.czero);
- return;
- } else if (ul == 1) {
- addReply(c,shared.cone);
- return;
- }
- len = snprintf(buf,sizeof(buf),":%lu\r\n",ul);
- addReplySds(c,sdsnewlen(buf,len));
+void addReplyMultiBulkLen(redisClient *c, long length) {
+ _addReplyLongLong(c,length,'*');
}
void addReplyBulkLen(redisClient *c, robj *obj) {
- size_t len, intlen;
- char buf[128];
+ size_t len;
if (obj->encoding == REDIS_ENCODING_RAW) {
len = sdslen(obj->ptr);
@@ -133,11 +316,7 @@ void addReplyBulkLen(redisClient *c, robj *obj) {
len++;
}
}
- buf[0] = '$';
- intlen = ll2string(buf+1,sizeof(buf)-1,(long long)len);
- buf[intlen+1] = '\r';
- buf[intlen+2] = '\n';
- addReplySds(c,sdsnewlen(buf,intlen+3));
+ _addReplyLongLong(c,len,'$');
}
void addReplyBulk(redisClient *c, robj *obj) {
@@ -275,7 +454,8 @@ void freeClient(redisClient *c) {
server.vm_blocked_clients--;
}
listRelease(c->io_keys);
- /* Master/slave cleanup */
+ /* Master/slave cleanup.
+ * Case 1: we lost the connection with a slave. */
if (c->flags & REDIS_SLAVE) {
if (c->replstate == REDIS_REPL_SEND_BULK && c->repldbfd != -1)
close(c->repldbfd);
@@ -284,9 +464,20 @@ void freeClient(redisClient *c) {
redisAssert(ln != NULL);
listDelNode(l,ln);
}
+
+ /* Case 2: we lost the connection with the master. */
if (c->flags & REDIS_MASTER) {
server.master = NULL;
server.replstate = REDIS_REPL_CONNECT;
+ /* Since we lost the connection with the master, we should also
+ * close the connection with all our slaves if we have any, so
+ * when we'll resync with the master the other slaves will sync again
+ * with us as well. Note that also when the slave is not connected
+ * to the master it will keep refusing connections by other slaves. */
+ while (listLength(server.slaves)) {
+ ln = listFirst(server.slaves);
+ freeClient((redisClient*)ln->value);
+ }
}
/* Release memory */
zfree(c->argv);
@@ -295,34 +486,6 @@ void freeClient(redisClient *c) {
zfree(c);
}
-#define GLUEREPLY_UP_TO (1024)
-static void glueReplyBuffersIfNeeded(redisClient *c) {
- int copylen = 0;
- char buf[GLUEREPLY_UP_TO];
- listNode *ln;
- listIter li;
- robj *o;
-
- listRewind(c->reply,&li);
- while((ln = listNext(&li))) {
- int objlen;
-
- o = ln->value;
- objlen = sdslen(o->ptr);
- if (copylen + objlen <= GLUEREPLY_UP_TO) {
- memcpy(buf+copylen,o->ptr,objlen);
- copylen += objlen;
- listDelNode(c->reply,ln);
- } else {
- if (copylen == 0) return;
- break;
- }
- }
- /* Now the output buffer is empty, add the new single element */
- o = createObject(REDIS_STRING,sdsnewlen(buf,copylen));
- listAddNodeHead(c->reply,o);
-}
-
void sendReplyToClient(aeEventLoop *el, int fd, void *privdata, int mask) {
redisClient *c = privdata;
int nwritten = 0, totwritten = 0, objlen;
@@ -339,31 +502,48 @@ void sendReplyToClient(aeEventLoop *el, int fd, void *privdata, int mask) {
return;
}
- while(listLength(c->reply)) {
- if (server.glueoutputbuf && listLength(c->reply) > 1)
- glueReplyBuffersIfNeeded(c);
+ while(c->bufpos > 0 || listLength(c->reply)) {
+ if (c->bufpos > 0) {
+ if (c->flags & REDIS_MASTER) {
+ /* Don't reply to a master */
+ nwritten = c->bufpos - c->sentlen;
+ } else {
+ nwritten = write(fd,c->buf+c->sentlen,c->bufpos-c->sentlen);
+ if (nwritten <= 0) break;
+ }
+ c->sentlen += nwritten;
+ totwritten += nwritten;
- o = listNodeValue(listFirst(c->reply));
- objlen = sdslen(o->ptr);
-
- if (objlen == 0) {
- listDelNode(c->reply,listFirst(c->reply));
- continue;
- }
-
- if (c->flags & REDIS_MASTER) {
- /* Don't reply to a master */
- nwritten = objlen - c->sentlen;
+ /* If the buffer was sent, set bufpos to zero to continue with
+ * the remainder of the reply. */
+ if (c->sentlen == c->bufpos) {
+ c->bufpos = 0;
+ c->sentlen = 0;
+ }
} else {
- nwritten = write(fd, ((char*)o->ptr)+c->sentlen, objlen - c->sentlen);
- if (nwritten <= 0) break;
- }
- c->sentlen += nwritten;
- totwritten += nwritten;
- /* If we fully sent the object on head go to the next one */
- if (c->sentlen == objlen) {
- listDelNode(c->reply,listFirst(c->reply));
- c->sentlen = 0;
+ o = listNodeValue(listFirst(c->reply));
+ objlen = sdslen(o->ptr);
+
+ if (objlen == 0) {
+ listDelNode(c->reply,listFirst(c->reply));
+ continue;
+ }
+
+ if (c->flags & REDIS_MASTER) {
+ /* Don't reply to a master */
+ nwritten = objlen - c->sentlen;
+ } else {
+ nwritten = write(fd, ((char*)o->ptr)+c->sentlen,objlen-c->sentlen);
+ if (nwritten <= 0) break;
+ }
+ c->sentlen += nwritten;
+ totwritten += nwritten;
+
+ /* If we fully sent the object on head go to the next one */
+ if (c->sentlen == objlen) {
+ listDelNode(c->reply,listFirst(c->reply));
+ c->sentlen = 0;
+ }
}
/* Note that we avoid sending more than REDIS_MAX_WRITE_PER_EVENT
* bytes, in a single threaded server it's a good idea to serve
@@ -472,6 +652,7 @@ void resetClient(redisClient *c) {
freeClientArgv(c);
c->bulklen = -1;
c->multibulk = 0;
+ c->newline = NULL;
}
void closeTimedoutClients(void) {
@@ -486,6 +667,7 @@ void closeTimedoutClients(void) {
if (server.maxidletime &&
!(c->flags & REDIS_SLAVE) && /* no timeout for slaves */
!(c->flags & REDIS_MASTER) && /* no timeout for masters */
+ !(c->flags & REDIS_BLOCKED) && /* no timeout for BLPOP */
dictSize(c->pubsub_channels) == 0 && /* no timeout for pubsub */
listLength(c->pubsub_patterns) == 0 &&
(now - c->lastinteraction > server.maxidletime))
@@ -502,6 +684,8 @@ void closeTimedoutClients(void) {
}
void processInputBuffer(redisClient *c) {
+ int seeknewline = 0;
+
again:
/* Before to process the input buffer, make sure the client is not
* waiting for a blocking operation such as BLPOP. Note that the first
@@ -510,15 +694,19 @@ again:
* in the input buffer the client may be blocked, and the "goto again"
* will try to reiterate. The following line will make it return asap. */
if (c->flags & REDIS_BLOCKED || c->flags & REDIS_IO_WAIT) return;
+
+ if (seeknewline && c->bulklen == -1) c->newline = strchr(c->querybuf,'\n');
+ seeknewline = 1;
if (c->bulklen == -1) {
/* Read the first line of the query */
- char *p = strchr(c->querybuf,'\n');
size_t querylen;
- if (p) {
+ if (c->newline) {
+ char *p = c->newline;
sds query, *argv;
int argc, j;
+ c->newline = NULL;
query = c->querybuf;
c->querybuf = sdsempty();
querylen = 1+(p-(query));
@@ -605,8 +793,14 @@ void readQueryFromClient(aeEventLoop *el, int fd, void *privdata, int mask) {
return;
}
if (nread) {
+ size_t oldlen = sdslen(c->querybuf);
c->querybuf = sdscatlen(c->querybuf, buf, nread);
c->lastinteraction = time(NULL);
+ /* Scan this new piece of the query for the newline. We do this
+ * here in order to make sure we perform this scan just one time
+ * per piece of buffer, leading to an O(N) scan instead of O(N*N) */
+ if (c->bulklen == -1 && c->newline == NULL)
+ c->newline = strchr(c->querybuf+oldlen,'\n');
} else {
return;
}
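The readQueryFromClient hunk above scans only the newly appended bytes for a newline, so each byte of the query is examined once however many reads it takes to arrive. A self-contained sketch of that idea follows. It is an assumption-laden illustration, not the patch's code: qbuf, qbuf_init and qbuf_append are hypothetical names, the buffer is a fixed array rather than an sds string, memchr replaces strchr, and the newline position is kept as an offset instead of a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define QBUF_CAP 256   /* hypothetical fixed capacity for the sketch */

typedef struct {
    char data[QBUF_CAP];
    size_t len;
    ptrdiff_t newline;   /* offset of the first '\n' seen, or -1 */
} qbuf;

void qbuf_init(qbuf *q) {
    q->len = 0;
    q->newline = -1;
}

/* Appends n bytes and scans only those new bytes for '\n'.
 * Returns 0 on success, -1 when the data does not fit. */
int qbuf_append(qbuf *q, const char *src, size_t n) {
    size_t oldlen;

    assert(q != NULL && (src != NULL || n == 0));
    if (n == 0) return 0;
    if (n > QBUF_CAP - q->len) return -1;

    oldlen = q->len;
    memcpy(q->data + oldlen, src, n);
    q->len += n;

    /* Skip the scan once a newline is already known. */
    if (q->newline < 0) {
        const char *p = memchr(q->data + oldlen, '\n', n);
        if (p != NULL) q->newline = p - q->data;
    }
    return 0;
}
```

Storing an offset rather than a pointer keeps the cached position valid even if the buffer is later moved or copied.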
diff --git a/src/object.c b/src/object.c
index 21268340..c1a08245 100644
--- a/src/object.c
+++ b/src/object.c
@@ -74,7 +74,16 @@ robj *createZiplistObject(void) {
robj *createSetObject(void) {
dict *d = dictCreate(&setDictType,NULL);
- return createObject(REDIS_SET,d);
+ robj *o = createObject(REDIS_SET,d);
+ o->encoding = REDIS_ENCODING_HT;
+ return o;
+}
+
+robj *createIntsetObject(void) {
+ intset *is = intsetNew();
+ robj *o = createObject(REDIS_SET,is);
+ o->encoding = REDIS_ENCODING_INTSET;
+ return o;
}
robj *createHashObject(void) {
@@ -115,7 +124,16 @@ void freeListObject(robj *o) {
}
void freeSetObject(robj *o) {
- dictRelease((dict*) o->ptr);
+ switch (o->encoding) {
+ case REDIS_ENCODING_HT:
+ dictRelease((dict*) o->ptr);
+ break;
+ case REDIS_ENCODING_INTSET:
+ zfree(o->ptr);
+ break;
+ default:
+ redisPanic("Unknown set encoding type");
+ }
}
void freeZsetObject(robj *o) {
@@ -336,9 +354,9 @@ int getDoubleFromObjectOrReply(redisClient *c, robj *o, double *target, const ch
double value;
if (getDoubleFromObject(o, &value) != REDIS_OK) {
if (msg != NULL) {
- addReplySds(c, sdscatprintf(sdsempty(), "-ERR %s\r\n", msg));
+ addReplyError(c,(char*)msg);
} else {
- addReplySds(c, sdsnew("-ERR value is not a double\r\n"));
+ addReplyError(c,"value is not a double");
}
return REDIS_ERR;
}
@@ -358,6 +376,8 @@ int getLongLongFromObject(robj *o, long long *target) {
if (o->encoding == REDIS_ENCODING_RAW) {
value = strtoll(o->ptr, &eptr, 10);
if (eptr[0] != '\0') return REDIS_ERR;
+ if (errno == ERANGE && (value == LLONG_MIN || value == LLONG_MAX))
+ return REDIS_ERR;
} else if (o->encoding == REDIS_ENCODING_INT) {
value = (long)o->ptr;
} else {
@@ -365,7 +385,7 @@ int getLongLongFromObject(robj *o, long long *target) {
}
}
- *target = value;
+ if (target) *target = value;
return REDIS_OK;
}
@@ -373,9 +393,9 @@ int getLongLongFromObjectOrReply(redisClient *c, robj *o, long long *target, con
long long value;
if (getLongLongFromObject(o, &value) != REDIS_OK) {
if (msg != NULL) {
- addReplySds(c, sdscatprintf(sdsempty(), "-ERR %s\r\n", msg));
+ addReplyError(c,(char*)msg);
} else {
- addReplySds(c, sdsnew("-ERR value is not an integer\r\n"));
+ addReplyError(c,"value is not an integer or out of range");
}
return REDIS_ERR;
}
@@ -390,9 +410,9 @@ int getLongFromObjectOrReply(redisClient *c, robj *o, long *target, const char *
if (getLongLongFromObjectOrReply(c, o, &value, msg) != REDIS_OK) return REDIS_ERR;
if (value < LONG_MIN || value > LONG_MAX) {
if (msg != NULL) {
- addReplySds(c, sdscatprintf(sdsempty(), "-ERR %s\r\n", msg));
+ addReplyError(c,(char*)msg);
} else {
- addReplySds(c, sdsnew("-ERR value is out of range\r\n"));
+ addReplyError(c,"value is out of range");
}
return REDIS_ERR;
}
@@ -409,6 +429,7 @@ char *strEncoding(int encoding) {
case REDIS_ENCODING_ZIPMAP: return "zipmap";
case REDIS_ENCODING_LINKEDLIST: return "linkedlist";
case REDIS_ENCODING_ZIPLIST: return "ziplist";
+ case REDIS_ENCODING_INTSET: return "intset";
default: return "unknown";
}
}
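
Note on the getLongLongFromObject hunk above: it adds an ERANGE check after strtoll(). The sketch below shows the same conversion outside Redis, under stated assumptions. The name parse_long_long is hypothetical. The sketch clears errno before the call, because strtoll() does not reset errno on success; the hunk as shown does not include such a reset, and the surrounding code is not visible here. The empty-input check is also an addition of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Returns 0 and stores the value in *target (when target is not NULL)
 * on success. Returns -1 on empty input, trailing garbage, or overflow. */
static int parse_long_long(const char *s, long long *target) {
    char *eptr;
    long long value;

    assert(s != NULL);
    errno = 0;                       /* so a stale ERANGE is not misread */
    value = strtoll(s, &eptr, 10);
    if (eptr == s || eptr[0] != '\0') return -1;
    if (errno == ERANGE && (value == LLONG_MIN || value == LLONG_MAX))
        return -1;
    if (target) *target = value;     /* NULL target: validate only */
    return 0;
}
```

Passing NULL as target mirrors the `if (target) *target = value;` change in the patch, which lets a caller check that a string is a valid integer without storing the result.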
diff --git a/src/rdb.c b/src/rdb.c
index 3fa284e1..589b536a 100644
--- a/src/rdb.c
+++ b/src/rdb.c
@@ -260,17 +260,29 @@ int rdbSaveObject(FILE *fp, robj *o) {
}
} else if (o->type == REDIS_SET) {
/* Save a set value */
- dict *set = o->ptr;
- dictIterator *di = dictGetIterator(set);
- dictEntry *de;
+ if (o->encoding == REDIS_ENCODING_HT) {
+ dict *set = o->ptr;
+ dictIterator *di = dictGetIterator(set);
+ dictEntry *de;
- if (rdbSaveLen(fp,dictSize(set)) == -1) return -1;
- while((de = dictNext(di)) != NULL) {
- robj *eleobj = dictGetEntryKey(de);
+ if (rdbSaveLen(fp,dictSize(set)) == -1) return -1;
+ while((de = dictNext(di)) != NULL) {
+ robj *eleobj = dictGetEntryKey(de);
+ if (rdbSaveStringObject(fp,eleobj) == -1) return -1;
+ }
+ dictReleaseIterator(di);
+ } else if (o->encoding == REDIS_ENCODING_INTSET) {
+ intset *is = o->ptr;
+ int64_t llval;
+ int i = 0;
- if (rdbSaveStringObject(fp,eleobj) == -1) return -1;
+ if (rdbSaveLen(fp,intsetLen(is)) == -1) return -1;
+ while(intsetGet(is,i++,&llval)) {
+ if (rdbSaveLongLongAsStringObject(fp,llval) == -1) return -1;
+ }
+ } else {
+ redisPanic("Unknown set encoding");
}
- dictReleaseIterator(di);
} else if (o->type == REDIS_ZSET) {
/* Save a sorted set value */
zset *zs = o->ptr;
@@ -445,6 +457,7 @@ int rdbSaveBackground(char *filename) {
if (server.bgsavechildpid != -1) return REDIS_ERR;
if (server.vm_enabled) waitEmptyIOJobsQueue();
+ server.dirty_before_bgsave = server.dirty;
if ((childpid = fork()) == 0) {
/* Child */
if (server.vm_enabled) vmReopenSwapFile();
@@ -629,6 +642,7 @@ int rdbLoadDoubleValue(FILE *fp, double *val) {
robj *rdbLoadObject(int type, FILE *fp) {
robj *o, *ele, *dec;
size_t len;
+ unsigned int i;
redisLog(REDIS_DEBUG,"LOADING OBJECT %d (at %d)\n",type,ftell(fp));
if (type == REDIS_STRING) {
@@ -670,16 +684,41 @@ robj *rdbLoadObject(int type, FILE *fp) {
} else if (type == REDIS_SET) {
/* Read list/set value */
if ((len = rdbLoadLen(fp,NULL)) == REDIS_RDB_LENERR) return NULL;
- o = createSetObject();
- /* It's faster to expand the dict to the right size asap in order
- * to avoid rehashing */
- if (len > DICT_HT_INITIAL_SIZE)
- dictExpand(o->ptr,len);
+
+ /* Use a regular set when there are too many entries. */
+ if (len > server.set_max_intset_entries) {
+ o = createSetObject();
+ /* It's faster to expand the dict to the right size asap in order
+ * to avoid rehashing */
+ if (len > DICT_HT_INITIAL_SIZE)
+ dictExpand(o->ptr,len);
+ } else {
+ o = createIntsetObject();
+ }
+
/* Load every single element of the list/set */
- while(len--) {
+ for (i = 0; i < len; i++) {
+ long long llval;
if ((ele = rdbLoadEncodedStringObject(fp)) == NULL) return NULL;
ele = tryObjectEncoding(ele);
- dictAdd((dict*)o->ptr,ele,NULL);
+
+ if (o->encoding == REDIS_ENCODING_INTSET) {
+ /* Fetch integer value from element */
+ if (isObjectRepresentableAsLongLong(ele,&llval) == REDIS_OK) {
+ o->ptr = intsetAdd(o->ptr,llval,NULL);
+ } else {
+ setTypeConvert(o,REDIS_ENCODING_HT);
+ dictExpand(o->ptr,len);
+ }
+ }
+
+ /* This will also be called when the set was just converted
+ * to regular hashtable encoded set */
+ if (o->encoding == REDIS_ENCODING_HT) {
+ dictAdd((dict*)o->ptr,ele,NULL);
+ } else {
+ decrRefCount(ele);
+ }
}
} else if (type == REDIS_ZSET) {
/* Read list/set value */
@@ -692,13 +731,14 @@ robj *rdbLoadObject(int type, FILE *fp) {
/* Load every single element of the list/set */
while(zsetlen--) {
robj *ele;
- double *score = zmalloc(sizeof(double));
+ double score;
+ zskiplistNode *znode;
if ((ele = rdbLoadEncodedStringObject(fp)) == NULL) return NULL;
ele = tryObjectEncoding(ele);
- if (rdbLoadDoubleValue(fp,score) == -1) return NULL;
- dictAdd(zs->dict,ele,score);
- zslInsert(zs->zsl,*score,ele);
+ if (rdbLoadDoubleValue(fp,&score) == -1) return NULL;
+ znode = zslInsert(zs->zsl,score,ele);
+ dictAdd(zs->dict,ele,&znode->score);
incrRefCount(ele); /* added to skiplist */
}
} else if (type == REDIS_HASH) {
@@ -876,7 +916,7 @@ void backgroundSaveDoneHandler(int statloc) {
if (!bysignal && exitcode == 0) {
redisLog(REDIS_NOTICE,
"Background saving terminated with success");
- server.dirty = 0;
+ server.dirty = server.dirty - server.dirty_before_bgsave;
server.lastsave = time(NULL);
} else if (!bysignal && exitcode != 0) {
redisLog(REDIS_WARNING, "Background saving error");
diff --git a/src/redis-benchmark.c b/src/redis-benchmark.c
index b3e729f1..68c46ad8 100644
--- a/src/redis-benchmark.c
+++ b/src/redis-benchmark.c
@@ -76,6 +76,7 @@ static struct config {
long long start;
long long totlatency;
int *latency;
+ char *title;
list *clients;
int quiet;
int loop;
@@ -207,16 +208,27 @@ static void clientDone(client c) {
}
}
+/* Read a length from the buffer pointed to by *p, store the length in *len,
+ * and return the number of bytes that the cursor advanced. */
+static int readLen(char *p, int *len) {
+ char *tail = strstr(p,"\r\n");
+ if (tail == NULL)
+ return 0;
+ *tail = '\0';
+ *len = atoi(p+1);
+ return tail+2-p;
+}
+
static void readHandler(aeEventLoop *el, int fd, void *privdata, int mask)
{
- char buf[1024];
- int nread;
+ char buf[1024], *p;
+ int nread, pos=0, len=0;
client c = privdata;
REDIS_NOTUSED(el);
REDIS_NOTUSED(fd);
REDIS_NOTUSED(mask);
- nread = read(c->fd, buf, 1024);
+ nread = read(c->fd,buf,sizeof(buf));
if (nread == -1) {
fprintf(stderr, "Reading from socket: %s\n", strerror(errno));
freeClient(c);
@@ -229,82 +241,89 @@ static void readHandler(aeEventLoop *el, int fd, void *privdata, int mask)
}
c->totreceived += nread;
c->ibuf = sdscatlen(c->ibuf,buf,nread);
+ len = sdslen(c->ibuf);
-processdata:
- /* Are we waiting for the first line of the command of for sdf
- * count in bulk or multi bulk operations? */
if (c->replytype == REPLY_INT ||
- c->replytype == REPLY_RETCODE ||
- (c->replytype == REPLY_BULK && c->readlen == -1) ||
- (c->replytype == REPLY_MBULK && c->readlen == -1) ||
- (c->replytype == REPLY_MBULK && c->mbulk == -1)) {
- char *p;
-
- /* Check if the first line is complete. This is only true if
- * there is a newline inside the buffer. */
- if ((p = strchr(c->ibuf,'\n')) != NULL) {
- if (c->replytype == REPLY_BULK ||
- (c->replytype == REPLY_MBULK && c->mbulk != -1))
- {
- /* Read the count of a bulk reply (being it a single bulk or
- * a multi bulk reply). "$" for the protocol spec. */
- *p = '\0';
- *(p-1) = '\0';
- c->readlen = atoi(c->ibuf+1)+2;
- // printf("BULK ATOI: %s\n", c->ibuf+1);
- /* Handle null bulk reply "$-1" */
- if (c->readlen-2 == -1) {
- clientDone(c);
- return;
- }
- /* Leave all the rest in the input buffer */
- c->ibuf = sdsrange(c->ibuf,(p-c->ibuf)+1,-1);
- /* fall through to reach the point where the code will try
- * to check if the bulk reply is complete. */
- } else if (c->replytype == REPLY_MBULK && c->mbulk == -1) {
- /* Read the count of a multi bulk reply. That is, how many
- * bulk replies we have to read next. "*" protocol. */
- *p = '\0';
- *(p-1) = '\0';
- c->mbulk = atoi(c->ibuf+1);
- /* Handle null bulk reply "*-1" */
- if (c->mbulk == -1) {
- clientDone(c);
- return;
- }
- // printf("%p) %d elements list\n", c, c->mbulk);
- /* Leave all the rest in the input buffer */
- c->ibuf = sdsrange(c->ibuf,(p-c->ibuf)+1,-1);
- goto processdata;
- } else {
- c->ibuf = sdstrim(c->ibuf,"\r\n");
- clientDone(c);
- return;
- }
- }
- }
- /* bulk read, did we read everything? */
- if (((c->replytype == REPLY_MBULK && c->mbulk != -1) ||
- (c->replytype == REPLY_BULK)) && c->readlen != -1 &&
- (unsigned)c->readlen <= sdslen(c->ibuf))
+ c->replytype == REPLY_RETCODE)
{
- // printf("BULKSTATUS mbulk:%d readlen:%d sdslen:%d\n",
- // c->mbulk,c->readlen,sdslen(c->ibuf));
- if (c->replytype == REPLY_BULK) {
- clientDone(c);
- } else if (c->replytype == REPLY_MBULK) {
- // printf("%p) %d (%d)) ",c, c->mbulk, c->readlen);
- // fwrite(c->ibuf,c->readlen,1,stdout);
- // printf("\n");
- if (--c->mbulk == 0) {
- clientDone(c);
+ /* Check if the first line is complete. This is everything we need
+ * when waiting for an integer or status code reply.*/
+ if ((p = strstr(c->ibuf,"\r\n")) != NULL)
+ goto done;
+ } else if (c->replytype == REPLY_BULK) {
+ int advance = 0;
+ if (c->readlen < 0) {
+ advance = readLen(c->ibuf+pos,&c->readlen);
+ if (advance) {
+ pos += advance;
+ if (c->readlen == -1) {
+ goto done;
+ } else {
+ /* include the trailing \r\n */
+ c->readlen += 2;
+ }
} else {
- c->ibuf = sdsrange(c->ibuf,c->readlen,-1);
- c->readlen = -1;
- goto processdata;
+ goto skip;
}
}
+
+ int canconsume;
+ if (c->readlen > 0) {
+ canconsume = c->readlen > (len-pos) ? (len-pos) : c->readlen;
+ c->readlen -= canconsume;
+ pos += canconsume;
+ }
+
+ if (c->readlen == 0)
+ goto done;
+ } else if (c->replytype == REPLY_MBULK) {
+ int advance = 0;
+ if (c->mbulk == -1) {
+ advance = readLen(c->ibuf+pos,&c->mbulk);
+ if (advance) {
+ pos += advance;
+ if (c->mbulk == -1)
+ goto done;
+ } else {
+ goto skip;
+ }
+ }
+
+ int canconsume;
+ while(c->mbulk > 0 && pos < len) {
+ if (c->readlen > 0) {
+ canconsume = c->readlen > (len-pos) ? (len-pos) : c->readlen;
+ c->readlen -= canconsume;
+ pos += canconsume;
+ if (c->readlen == 0)
+ c->mbulk--;
+ } else {
+ advance = readLen(c->ibuf+pos,&c->readlen);
+ if (advance) {
+ pos += advance;
+ if (c->readlen == -1) {
+ c->mbulk--;
+ continue;
+ } else {
+ /* include the trailing \r\n */
+ c->readlen += 2;
+ }
+ } else {
+ goto skip;
+ }
+ }
+ }
+
+ if (c->mbulk == 0)
+ goto done;
}
+
+skip:
+ c->ibuf = sdsrange(c->ibuf,pos,-1);
+ return;
+done:
+ clientDone(c);
+ return;
}
static void writeHandler(aeEventLoop *el, int fd, void *privdata, int mask)
@@ -376,13 +395,13 @@ static void createMissingClients(client c) {
}
}
-static void showLatencyReport(char *title) {
+static void showLatencyReport(void) {
int j, seen = 0;
float perc, reqpersec;
reqpersec = (float)config.donerequests/((float)config.totlatency/1000);
if (!config.quiet) {
- printf("====== %s ======\n", title);
+ printf("====== %s ======\n", config.title);
printf(" %d requests completed in %.2f seconds\n", config.donerequests,
(float)config.totlatency/1000);
printf(" %d parallel clients\n", config.numclients);
@@ -398,20 +417,20 @@ static void showLatencyReport(char *title) {
}
printf("%.2f requests per second\n\n", reqpersec);
} else {
- printf("%s: %.2f requests per second\n", title, reqpersec);
+ printf("%s: %.2f requests per second\n", config.title, reqpersec);
}
}
-static void prepareForBenchmark(void)
-{
+static void prepareForBenchmark(char *title) {
memset(config.latency,0,sizeof(int)*(MAX_LATENCY+1));
+ config.title = title;
config.start = mstime();
config.donerequests = 0;
}
-static void endBenchmark(char *title) {
+static void endBenchmark(void) {
config.totlatency = mstime()-config.start;
- showLatencyReport(title);
+ showLatencyReport();
freeAllClients();
}
@@ -489,6 +508,18 @@ void parseOptions(int argc, char **argv) {
}
}
+int showThroughput(struct aeEventLoop *eventLoop, long long id, void *clientData) {
+ REDIS_NOTUSED(eventLoop);
+ REDIS_NOTUSED(id);
+ REDIS_NOTUSED(clientData);
+
+ float dt = (float)(mstime()-config.start)/1000.0;
+ float rps = (float)config.donerequests/dt;
+ printf("%s: %.2f\r", config.title, rps);
+ fflush(stdout);
+ return 250; /* every 250ms */
+}
+
int main(int argc, char **argv) {
client c;
@@ -500,6 +531,7 @@ int main(int argc, char **argv) {
config.requests = 10000;
config.liveclients = 0;
config.el = aeCreateEventLoop();
+ aeCreateTimeEvent(config.el,1,showThroughput,NULL,NULL);
config.keepalive = 1;
config.donerequests = 0;
config.datasize = 3;
@@ -524,7 +556,7 @@ int main(int argc, char **argv) {
if (config.idlemode) {
printf("Creating %d idle connections and waiting forever (Ctrl+C when done)\n", config.numclients);
- prepareForBenchmark();
+ prepareForBenchmark("IDLE");
c = createClient();
if (!c) exit(1);
c->obuf = sdsempty();
@@ -535,25 +567,25 @@ int main(int argc, char **argv) {
}
do {
- prepareForBenchmark();
+ prepareForBenchmark("PING");
c = createClient();
if (!c) exit(1);
c->obuf = sdscat(c->obuf,"PING\r\n");
prepareClientForReply(c,REPLY_RETCODE);
createMissingClients(c);
aeMain(config.el);
- endBenchmark("PING");
+ endBenchmark();
- prepareForBenchmark();
+ prepareForBenchmark("PING (multi bulk)");
c = createClient();
if (!c) exit(1);
c->obuf = sdscat(c->obuf,"*1\r\n$4\r\nPING\r\n");
prepareClientForReply(c,REPLY_RETCODE);
createMissingClients(c);
aeMain(config.el);
- endBenchmark("PING (multi bulk)");
+ endBenchmark();
- prepareForBenchmark();
+ prepareForBenchmark("SET");
c = createClient();
if (!c) exit(1);
c->obuf = sdscatprintf(c->obuf,"SET foo_rand000000000000 %d\r\n",config.datasize);
@@ -567,106 +599,106 @@ int main(int argc, char **argv) {
prepareClientForReply(c,REPLY_RETCODE);
createMissingClients(c);
aeMain(config.el);
- endBenchmark("SET");
+ endBenchmark();
- prepareForBenchmark();
+ prepareForBenchmark("GET");
c = createClient();
if (!c) exit(1);
c->obuf = sdscat(c->obuf,"GET foo_rand000000000000\r\n");
prepareClientForReply(c,REPLY_BULK);
createMissingClients(c);
aeMain(config.el);
- endBenchmark("GET");
+ endBenchmark();
- prepareForBenchmark();
+ prepareForBenchmark("INCR");
c = createClient();
if (!c) exit(1);
c->obuf = sdscat(c->obuf,"INCR counter_rand000000000000\r\n");
prepareClientForReply(c,REPLY_INT);
createMissingClients(c);
aeMain(config.el);
- endBenchmark("INCR");
+ endBenchmark();
- prepareForBenchmark();
+ prepareForBenchmark("LPUSH");
c = createClient();
if (!c) exit(1);
c->obuf = sdscat(c->obuf,"LPUSH mylist 3\r\nbar\r\n");
prepareClientForReply(c,REPLY_INT);
createMissingClients(c);
aeMain(config.el);
- endBenchmark("LPUSH");
+ endBenchmark();
- prepareForBenchmark();
+ prepareForBenchmark("LPOP");
c = createClient();
if (!c) exit(1);
c->obuf = sdscat(c->obuf,"LPOP mylist\r\n");
prepareClientForReply(c,REPLY_BULK);
createMissingClients(c);
aeMain(config.el);
- endBenchmark("LPOP");
+ endBenchmark();
- prepareForBenchmark();
+ prepareForBenchmark("SADD");
c = createClient();
if (!c) exit(1);
c->obuf = sdscat(c->obuf,"SADD myset 24\r\ncounter_rand000000000000\r\n");
prepareClientForReply(c,REPLY_RETCODE);
createMissingClients(c);
aeMain(config.el);
- endBenchmark("SADD");
+ endBenchmark();
- prepareForBenchmark();
+ prepareForBenchmark("SPOP");
c = createClient();
if (!c) exit(1);
c->obuf = sdscat(c->obuf,"SPOP myset\r\n");
prepareClientForReply(c,REPLY_BULK);
createMissingClients(c);
aeMain(config.el);
- endBenchmark("SPOP");
+ endBenchmark();
- prepareForBenchmark();
+ prepareForBenchmark("LPUSH (again, in order to bench LRANGE)");
c = createClient();
if (!c) exit(1);
c->obuf = sdscat(c->obuf,"LPUSH mylist 3\r\nbar\r\n");
prepareClientForReply(c,REPLY_RETCODE);
createMissingClients(c);
aeMain(config.el);
- endBenchmark("LPUSH (again, in order to bench LRANGE)");
+ endBenchmark();
- prepareForBenchmark();
+ prepareForBenchmark("LRANGE (first 100 elements)");
c = createClient();
if (!c) exit(1);
c->obuf = sdscat(c->obuf,"LRANGE mylist 0 99\r\n");
prepareClientForReply(c,REPLY_MBULK);
createMissingClients(c);
aeMain(config.el);
- endBenchmark("LRANGE (first 100 elements)");
+ endBenchmark();
- prepareForBenchmark();
+ prepareForBenchmark("LRANGE (first 300 elements)");
c = createClient();
if (!c) exit(1);
c->obuf = sdscat(c->obuf,"LRANGE mylist 0 299\r\n");
prepareClientForReply(c,REPLY_MBULK);
createMissingClients(c);
aeMain(config.el);
- endBenchmark("LRANGE (first 300 elements)");
+ endBenchmark();
- prepareForBenchmark();
+ prepareForBenchmark("LRANGE (first 450 elements)");
c = createClient();
if (!c) exit(1);
c->obuf = sdscat(c->obuf,"LRANGE mylist 0 449\r\n");
prepareClientForReply(c,REPLY_MBULK);
createMissingClients(c);
aeMain(config.el);
- endBenchmark("LRANGE (first 450 elements)");
+ endBenchmark();
- prepareForBenchmark();
+ prepareForBenchmark("LRANGE (first 600 elements)");
c = createClient();
if (!c) exit(1);
c->obuf = sdscat(c->obuf,"LRANGE mylist 0 599\r\n");
prepareClientForReply(c,REPLY_MBULK);
createMissingClients(c);
aeMain(config.el);
- endBenchmark("LRANGE (first 600 elements)");
+ endBenchmark();
printf("\n");
} while(config.loop);
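
Note on the readHandler rewrite above: the new code parses a length header with readLen() and then consumes the payload in pieces, tracking how many bytes are still owed in c->readlen. The sketch below isolates those two steps. read_len follows the contract of readLen() in the patch; consume is a hypothetical helper that wraps the "canconsume" arithmetic, which the patch writes inline.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* p points at a header such as "$5\r\n" or "*3\r\n". Stores the number
 * in *len and returns how many bytes the header occupied, or 0 when the
 * terminating "\r\n" has not arrived yet. Modifies the buffer. */
static int read_len(char *p, int *len) {
    char *tail = strstr(p, "\r\n");

    if (tail == NULL) return 0;
    *tail = '\0';               /* terminate the digits for atoi() */
    *len = atoi(p + 1);         /* skip the '$' or '*' type byte */
    return (int)(tail + 2 - p);
}

/* Consume at most *remaining payload bytes out of avail available ones.
 * Returns the number of bytes consumed and decrements *remaining. */
static int consume(int *remaining, int avail) {
    int can;

    assert(remaining != NULL && avail >= 0);
    can = *remaining > avail ? avail : *remaining;
    *remaining -= can;
    return can;
}
```

A bulk reply "$5\r\nhello\r\n" gives a header of 4 bytes and a length of 5. After adding 2 for the trailing "\r\n", the 7 owed bytes can arrive across several reads, and the reply is complete when the remaining count reaches zero.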
diff --git a/src/redis-check-dump.c b/src/redis-check-dump.c
index 0b002790..987e1db3 100644
--- a/src/redis-check-dump.c
+++ b/src/redis-check-dump.c
@@ -65,8 +65,8 @@
/* data type to hold offset in file and size */
typedef struct {
void *data;
- unsigned long size;
- unsigned long offset;
+ size_t size;
+ size_t offset;
} pos;
static unsigned char level = 0;
@@ -77,8 +77,8 @@ static pos positions[16];
/* Hold a stack of errors */
typedef struct {
char error[16][1024];
- unsigned long offset[16];
- unsigned int level;
+ size_t offset[16];
+ size_t level;
} errors_t;
static errors_t errors;
@@ -112,7 +112,7 @@ int readBytes(void *target, long num) {
if (p.offset + num > p.size) {
return 0;
} else {
- memcpy(target, (void*)((unsigned long)p.data + p.offset), num);
+ memcpy(target, (void*)((size_t)p.data + p.offset), num);
if (!peek) positions[level].offset += num;
}
return 1;
@@ -494,15 +494,17 @@ void printCentered(int indent, int width, char* body) {
printf("%s %s %s\n", head, body, tail);
}
-void printValid(int ops, int bytes) {
+void printValid(uint64_t ops, uint64_t bytes) {
char body[80];
- sprintf(body, "Processed %d valid opcodes (in %d bytes)", ops, bytes);
+ sprintf(body, "Processed %llu valid opcodes (in %llu bytes)",
+ (unsigned long long) ops, (unsigned long long) bytes);
printCentered(4, 80, body);
}
-void printSkipped(int bytes, int offset) {
+void printSkipped(uint64_t bytes, uint64_t offset) {
char body[80];
- sprintf(body, "Skipped %d bytes (resuming at 0x%08x)", bytes, offset);
+ sprintf(body, "Skipped %llu bytes (resuming at 0x%08llx)",
+ (unsigned long long) bytes, (unsigned long long) offset);
printCentered(4, 80, body);
}
@@ -541,7 +543,7 @@ void printErrorStack(entry *e) {
}
void process() {
- int i, num_errors = 0, num_valid_ops = 0, num_valid_bytes = 0;
+ uint64_t num_errors = 0, num_valid_ops = 0, num_valid_bytes = 0;
entry entry;
processHeader();
@@ -558,7 +560,9 @@ void process() {
num_valid_bytes = 0;
/* search for next valid entry */
- unsigned long offset = positions[0].offset + 1;
+ uint64_t offset = positions[0].offset + 1;
+ int i = 0;
+
while (!entry.success && offset < positions[0].size) {
positions[1].offset = offset;
@@ -606,9 +610,10 @@ void process() {
}
/* print summary on errors */
- if (num_errors > 0) {
+ if (num_errors) {
printf("\n");
- printf("Total unprocessable opcodes: %d\n", num_errors);
+ printf("Total unprocessable opcodes: %llu\n",
+ (unsigned long long) num_errors);
}
}
@@ -620,7 +625,7 @@ int main(int argc, char **argv) {
}
int fd;
- unsigned long size;
+ off_t size;
struct stat stat;
void *data;
@@ -634,6 +639,10 @@ int main(int argc, char **argv) {
size = stat.st_size;
}
+ if (sizeof(size_t) == sizeof(int32_t) && size >= INT_MAX) {
+ ERROR("Cannot check dump files >2GB on a 32-bit platform\n");
+ }
+
data = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
if (data == MAP_FAILED) {
ERROR("Cannot mmap: %s\n", argv[1]);
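
Note on the printValid and printSkipped hunks above: the counters become uint64_t and are cast to unsigned long long before being passed to the %llu and %llx conversions, which keeps the format string and argument types in agreement on both 32-bit and 64-bit platforms. A small sketch of the same pattern follows. The name format_skipped is hypothetical, and snprintf is used here in place of the sprintf call in the patch so that the write is bounded.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Formats the same message as printSkipped() into buf.
 * Returns the value reported by snprintf(). */
static int format_skipped(char *buf, size_t buflen,
                          uint64_t bytes, uint64_t offset) {
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen,
        "Skipped %llu bytes (resuming at 0x%08llx)",
        (unsigned long long) bytes, (unsigned long long) offset);
}
```

With the old `%d` conversions a byte count above INT_MAX could not be printed correctly; the cast form handles values such as 5000000000.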
diff --git a/src/redis-cli.c b/src/redis-cli.c
index 4dafba32..8866678b 100644
--- a/src/redis-cli.c
+++ b/src/redis-cli.c
@@ -36,6 +36,8 @@
#include
#include
#include
+#include
+#include
#include "anet.h"
#include "sds.h"
@@ -55,12 +57,14 @@ static struct config {
char *hostsocket;
long repeat;
int dbnum;
- int argn_from_stdin;
int interactive;
int shutdown;
int monitor_mode;
int pubsub_mode;
- int raw_output;
+ int raw_output; /* output mode per command */
+ int tty; /* flag for default output format */
+ int stdinarg; /* get last arg from stdin. (-x option) */
+ char mb_sep;
char *auth;
char *historyfile;
} config;
@@ -68,19 +72,18 @@ static struct config {
static int cliReadReply(int fd);
static void usage();
-static int cliConnect(void) {
+/* Connect to the server. If force is not zero the connection is performed
+ * even if there is already a connected socket. */
+static int cliConnect(int force) {
char err[ANET_ERR_LEN];
static int fd = ANET_ERR;
- if (fd == ANET_ERR) {
+ if (fd == ANET_ERR || force) {
+ if (force) close(fd);
if (config.hostsocket == NULL) {
fd = anetTcpConnect(err,config.hostip,config.hostport);
} else {
fd = anetUnixConnect(err,config.hostsocket);
- if (fd == ANET_ERR) {
- fprintf(stderr, "Could not connect to Redis at %s: %s", config.hostsocket, err);
- return -1;
- }
}
if (fd == ANET_ERR) {
fprintf(stderr,"Could not connect to Redis at ");
@@ -103,7 +106,7 @@ static sds cliReadLine(int fd) {
ssize_t ret;
ret = read(fd,&c,1);
- if (ret == -1) {
+ if (ret <= 0) {
sdsfree(line);
return NULL;
} else if ((ret == 0) || (c == '\n')) {
@@ -120,7 +123,7 @@ static int cliReadSingleLineReply(int fd, int quiet) {
if (reply == NULL) return 1;
if (!quiet)
- printf("%s\n", reply);
+ printf("%s", reply);
sdsfree(reply);
return 0;
}
@@ -147,7 +150,7 @@ static void printStringRepr(char *s, int len) {
}
s++;
}
- printf("\"\n");
+ printf("\"");
}
static int cliReadBulkReply(int fd) {
@@ -165,7 +168,7 @@ static int cliReadBulkReply(int fd) {
reply = zmalloc(bulklen);
anetRead(fd,reply,bulklen);
anetRead(fd,crlf,2);
- if (config.raw_output || !isatty(fileno(stdout))) {
+ if (config.raw_output || !config.tty) {
if (bulklen && fwrite(reply,bulklen,1,stdout) == 0) {
zfree(reply);
return 1;
@@ -182,6 +185,7 @@ static int cliReadBulkReply(int fd) {
static int cliReadMultiBulkReply(int fd) {
sds replylen = cliReadLine(fd);
int elements, c = 1;
+ int retval = 0;
if (replylen == NULL) return 1;
elements = atoi(replylen);
@@ -194,36 +198,45 @@ static int cliReadMultiBulkReply(int fd) {
printf("(empty list or set)\n");
}
while(elements--) {
- printf("%d. ", c);
- if (cliReadReply(fd)) return 1;
+ if (config.tty) printf("%d. ", c);
+ if (cliReadReply(fd)) retval = 1;
+ if (elements) printf("%c",config.mb_sep);
c++;
}
- return 0;
+ return retval;
}
static int cliReadReply(int fd) {
char type;
+ int nread;
- if (anetRead(fd,&type,1) <= 0) {
+ if ((nread = anetRead(fd,&type,1)) <= 0) {
if (config.shutdown) return 0;
- exit(1);
+ if (config.interactive &&
+ (nread == 0 || (nread == -1 && errno == ECONNRESET)))
+ {
+ return ECONNRESET;
+ } else {
+ printf("I/O error while reading from socket: %s",strerror(errno));
+ exit(1);
+ }
}
switch(type) {
case '-':
- printf("(error) ");
+ if (config.tty) printf("(error) ");
cliReadSingleLineReply(fd,0);
return 1;
case '+':
return cliReadSingleLineReply(fd,0);
case ':':
- printf("(integer) ");
+ if (config.tty) printf("(integer) ");
return cliReadSingleLineReply(fd,0);
case '$':
return cliReadBulkReply(fd);
case '*':
return cliReadMultiBulkReply(fd);
default:
- printf("protocol error, got '%c' as reply type byte\n", type);
+ printf("protocol error, got '%c' as reply type byte", type);
return 1;
}
}
@@ -248,17 +261,37 @@ static int selectDb(int fd) {
return 0;
}
+static void showInteractiveHelp(void) {
+ printf(
+ "\n"
+ "Welcome to redis-cli " REDIS_VERSION "!\n"
+ "Just type any valid Redis command to see a pretty printed output.\n"
+ "\n"
+ "It is possible to quote strings, like in:\n"
+ " set \"my key\" \"some string \\xff\\n\"\n"
+ "\n"
+ "You can find a list of valid Redis commands at\n"
+ " http://code.google.com/p/redis/wiki/CommandReference\n"
+ "\n"
+ "Note: redis-cli supports line editing, use up/down arrows for history."
+ "\n\n");
+}
+
static int cliSendCommand(int argc, char **argv, int repeat) {
char *command = argv[0];
int fd, j, retval = 0;
sds cmd;
config.raw_output = !strcasecmp(command,"info");
+ if (!strcasecmp(command,"help")) {
+ showInteractiveHelp();
+ return 0;
+ }
if (!strcasecmp(command,"shutdown")) config.shutdown = 1;
if (!strcasecmp(command,"monitor")) config.monitor_mode = 1;
if (!strcasecmp(command,"subscribe") ||
!strcasecmp(command,"psubscribe")) config.pubsub_mode = 1;
- if ((fd = cliConnect()) == -1) return 1;
+ if ((fd = cliConnect(0)) == -1) return 1;
/* Select db number */
retval = selectDb(fd);
@@ -279,21 +312,21 @@ static int cliSendCommand(int argc, char **argv, int repeat) {
while(repeat--) {
anetWrite(fd,cmd,sdslen(cmd));
while (config.monitor_mode) {
- cliReadSingleLineReply(fd,0);
+ if (cliReadSingleLineReply(fd,0)) exit(1);
+ printf("\n");
}
if (config.pubsub_mode) {
printf("Reading messages... (press Ctrl-c to quit)\n");
while (1) {
cliReadReply(fd);
- printf("\n");
+ printf("\n\n");
}
}
retval = cliReadReply(fd);
- if (retval) {
- return retval;
- }
+ if (!config.raw_output && config.tty) printf("\n");
+ if (retval) return retval;
}
return 0;
}
@@ -314,6 +347,8 @@ static int parseOptions(int argc, char **argv) {
i++;
} else if (!strcmp(argv[i],"-h") && lastarg) {
usage();
+ } else if (!strcmp(argv[i],"-x")) {
+ config.stdinarg = 1;
} else if (!strcmp(argv[i],"-p") && !lastarg) {
config.hostport = atoi(argv[i+1]);
i++;
@@ -330,11 +365,18 @@ static int parseOptions(int argc, char **argv) {
config.auth = argv[i+1];
i++;
} else if (!strcmp(argv[i],"-i")) {
- config.interactive = 1;
+ fprintf(stderr,
+"Starting interactive mode using -i is deprecated. Interactive mode is started\n"
+"by default when redis-cli is executed without a command to execute.\n"
+ );
} else if (!strcmp(argv[i],"-c")) {
- config.argn_from_stdin = 1;
+ fprintf(stderr,
+"Reading last argument from standard input using -c is deprecated.\n"
+"When standard input is connected to a pipe or regular file, it is\n"
+"automatically used as last argument.\n"
+ );
} else if (!strcmp(argv[i],"-v")) {
- printf("redis-cli shipped with Redis verison %s\n", REDIS_VERSION);
+ printf("redis-cli shipped with Redis version %s\n", REDIS_VERSION);
exit(0);
} else {
break;
@@ -362,9 +404,8 @@ static sds readArgFromStdin(void) {
static void usage() {
fprintf(stderr, "usage: redis-cli [-iv] [-h host] [-p port] [-s /path/to/socket] [-a authpw] [-r repeat_times] [-n db_num] cmd arg1 arg2 arg3 ... argN\n");
- fprintf(stderr, "usage: echo \"argN\" | redis-cli -c [-h host] [-p port] [-s /path/to/socket] [-a authpw] [-r repeat_times] [-n db_num] cmd arg1 arg2 ... arg(N-1)\n");
- fprintf(stderr, "\nIf a pipe from standard input is detected this data is used as last argument.\n\n");
- fprintf(stderr, "example: cat /etc/passwd | redis-cli set my_passwd\n");
+ fprintf(stderr, "usage: echo \"argN\" | redis-cli -x [options] cmd arg1 arg2 ... arg(N-1)\n\n");
+ fprintf(stderr, "example: cat /etc/passwd | redis-cli -x set my_passwd\n");
fprintf(stderr, "example: redis-cli get my_passwd\n");
fprintf(stderr, "example: redis-cli -r 100 lpush mylist x\n");
fprintf(stderr, "\nRun in interactive mode: redis-cli -i or just don't pass any command\n");
@@ -382,87 +423,39 @@ static char **convertToSds(int count, char** args) {
return sds;
}
-static char **splitArguments(char *line, int *argc) {
- char *p = line;
- char *current = NULL;
- char **vector = NULL;
-
- *argc = 0;
- while(1) {
- /* skip blanks */
- while(*p && isspace(*p)) p++;
- if (*p) {
- /* get a token */
- int inq=0; /* set to 1 if we are in "quotes" */
- int done = 0;
-
- if (current == NULL) current = sdsempty();
- while(!done) {
- if (inq) {
- if (*p == '\\' && *(p+1)) {
- char c;
-
- p++;
- switch(*p) {
- case 'n': c = '\n'; break;
- case 'r': c = '\r'; break;
- case 't': c = '\t'; break;
- case 'b': c = '\b'; break;
- case 'a': c = '\a'; break;
- default: c = *p; break;
- }
- current = sdscatlen(current,&c,1);
- } else if (*p == '"') {
- done = 1;
- } else {
- current = sdscatlen(current,p,1);
- }
- } else {
- switch(*p) {
- case ' ':
- case '\n':
- case '\r':
- case '\t':
- case '\0':
- done=1;
- break;
- case '"':
- inq=1;
- break;
- default:
- current = sdscatlen(current,p,1);
- break;
- }
- }
- if (*p) p++;
- }
- /* add the token to the vector */
- vector = zrealloc(vector,((*argc)+1)*sizeof(char*));
- vector[*argc] = current;
- (*argc)++;
- current = NULL;
- } else {
- return vector;
- }
- }
-}
-
#define LINE_BUFLEN 4096
static void repl() {
int argc, j;
- char *line, **argv;
+ char *line;
+ sds *argv;
+ config.interactive = 1;
while((line = linenoise("redis> ")) != NULL) {
if (line[0] != '\0') {
- argv = splitArguments(line,&argc);
+ argv = sdssplitargs(line,&argc);
linenoiseHistoryAdd(line);
if (config.historyfile) linenoiseHistorySave(config.historyfile);
- if (argc > 0) {
+ if (argv == NULL) {
+ printf("Invalid argument(s)\n");
+ continue;
+ } else if (argc > 0) {
if (strcasecmp(argv[0],"quit") == 0 ||
strcasecmp(argv[0],"exit") == 0)
- exit(0);
- else
- cliSendCommand(argc, argv, 1);
+ {
+ exit(0);
+ } else {
+ int err;
+
+ if ((err = cliSendCommand(argc, argv, 1)) != 0) {
+ if (err == ECONNRESET) {
+ printf("Reconnecting... ");
+ fflush(stdout);
+ if (cliConnect(1) == -1) exit(1);
+ printf("OK\n");
+ cliSendCommand(argc,argv,1);
+ }
+ }
+ }
}
/* Free the argument vector */
for (j = 0; j < argc; j++)
@@ -475,23 +468,37 @@ static void repl() {
exit(0);
}
+static int noninteractive(int argc, char **argv) {
+ int retval = 0;
+ if (config.stdinarg) {
+ argv = zrealloc(argv, (argc+1)*sizeof(char*));
+ argv[argc] = readArgFromStdin();
+ retval = cliSendCommand(argc+1, argv, config.repeat);
+ } else {
+ /* stdin is probably a tty, can be tested with S_ISCHR(s.st_mode) */
+ retval = cliSendCommand(argc, argv, config.repeat);
+ }
+ return retval;
+}
+
int main(int argc, char **argv) {
int firstarg;
- char **argvcopy;
config.hostip = "127.0.0.1";
config.hostport = 6379;
config.hostsocket = NULL;
config.repeat = 1;
config.dbnum = 0;
- config.argn_from_stdin = 0;
- config.shutdown = 0;
config.interactive = 0;
+ config.shutdown = 0;
config.monitor_mode = 0;
config.pubsub_mode = 0;
config.raw_output = 0;
+ config.stdinarg = 0;
config.auth = NULL;
config.historyfile = NULL;
+ config.tty = isatty(fileno(stdout)) || (getenv("FAKETTY") != NULL);
+ config.mb_sep = '\n';
if (getenv("HOME") != NULL) {
config.historyfile = malloc(256);
@@ -505,19 +512,20 @@ int main(int argc, char **argv) {
if (config.auth != NULL) {
char *authargv[2];
+ int dbnum = config.dbnum;
+ /* We need to save the real configured database number and set it to
+ * zero here, otherwise cliSendCommand() will try to perform the
+ * SELECT command before the authentication, and it will fail. */
+ config.dbnum = 0;
authargv[0] = "AUTH";
authargv[1] = config.auth;
cliSendCommand(2, convertToSds(2, authargv), 1);
+ config.dbnum = dbnum; /* restore the right DB number */
}
- if (argc == 0 || config.interactive == 1) repl();
-
- argvcopy = convertToSds(argc+1, argv);
- if (config.argn_from_stdin) {
- sds lastarg = readArgFromStdin();
- argvcopy[argc] = lastarg;
- argc++;
- }
- return cliSendCommand(argc, argvcopy, config.repeat);
+ /* Start interactive mode when no command is provided */
+ if (argc == 0) repl();
+ /* Otherwise, we have some arguments to execute */
+ return noninteractive(argc,convertToSds(argc,argv));
}
diff --git a/src/redis.c b/src/redis.c
index 0e9b73b7..50cf2f6c 100644
--- a/src/redis.c
+++ b/src/redis.c
@@ -51,6 +51,7 @@
#include
#include
#include
+#include
/* Our shared "common" objects */
@@ -170,6 +171,7 @@ struct redisCommand readonlyCommandTable[] = {
{"info",infoCommand,1,REDIS_CMD_INLINE,NULL,0,0,0},
{"monitor",monitorCommand,1,REDIS_CMD_INLINE,NULL,0,0,0},
{"ttl",ttlCommand,2,REDIS_CMD_INLINE,NULL,1,1,1},
+ {"persist",persistCommand,2,REDIS_CMD_INLINE,NULL,1,1,1},
{"slaveof",slaveofCommand,3,REDIS_CMD_INLINE,NULL,0,0,0},
{"debug",debugCommand,-2,REDIS_CMD_INLINE,NULL,0,0,0},
{"config",configCommand,-2,REDIS_CMD_BULK,NULL,0,0,0},
@@ -338,7 +340,7 @@ dictType zsetDictType = {
NULL, /* val dup */
dictEncObjKeyCompare, /* key compare */
dictRedisObjectDestructor, /* key destructor */
- dictVanillaFree /* val destructor of malloc(sizeof(double)) */
+ NULL /* val destructor */
};
/* Db->dict, keys are sds strings, vals are Redis objects. */
@@ -435,6 +437,48 @@ void updateDictResizePolicy(void) {
/* ======================= Cron: called every 100 ms ======================== */
+/* Try to expire a few timed out keys. The algorithm used is adaptive and
+ * will use few CPU cycles if there are few expiring keys, otherwise
+ * it will get more aggressive to avoid that too much memory is used by
+ * keys that can be removed from the keyspace. */
+void activeExpireCycle(void) {
+ int j;
+
+ for (j = 0; j < server.dbnum; j++) {
+ int expired;
+ redisDb *db = server.db+j;
+
+ /* Continue to expire if at the end of the cycle more than 25%
+ * of the keys were expired. */
+ do {
+ long num = dictSize(db->expires);
+ time_t now = time(NULL);
+
+ expired = 0;
+ if (num > REDIS_EXPIRELOOKUPS_PER_CRON)
+ num = REDIS_EXPIRELOOKUPS_PER_CRON;
+ while (num--) {
+ dictEntry *de;
+ time_t t;
+
+ if ((de = dictGetRandomKey(db->expires)) == NULL) break;
+ t = (time_t) dictGetEntryVal(de);
+ if (now > t) {
+ sds key = dictGetEntryKey(de);
+ robj *keyobj = createStringObject(key,sdslen(key));
+
+ propagateExpire(db,keyobj);
+ dbDelete(db,keyobj);
+ decrRefCount(keyobj);
+ expired++;
+ server.stat_expiredkeys++;
+ }
+ }
+ } while (expired > REDIS_EXPIRELOOKUPS_PER_CRON/4);
+ }
+}
+
+
int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
int j, loops = server.cronloops++;
REDIS_NOTUSED(eventLoop);
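Review note: `activeExpireCycle()` samples a bounded number of random keys per round and keeps going only while more than a quarter of the lookup budget was spent on expired keys. The sketch below reproduces that control flow on a toy array of expire times. `toy_expires`, `toy_expire_cycle` and the value chosen for `LOOKUPS_PER_CYCLE` are assumptions for illustration; the real code samples a Redis dict and also propagates the deletion.

```c
#include <stdlib.h>
#include <time.h>

/* Stand-in for REDIS_EXPIRELOOKUPS_PER_CRON; the value is assumed. */
#define LOOKUPS_PER_CYCLE 100

/* Toy keyspace: an array of absolute expire times (not a Redis dict). */
struct toy_expires {
    time_t *when;   /* expire time of each live key */
    size_t count;   /* number of live keys */
};

/* Sample random keys, drop the expired ones, and repeat while more than
 * LOOKUPS_PER_CYCLE/4 keys were expired in the last round.
 * Returns the number of keys removed. */
size_t toy_expire_cycle(struct toy_expires *db, time_t now) {
    size_t removed = 0;
    int expired;

    do {
        size_t num = db->count;

        expired = 0;
        if (num > LOOKUPS_PER_CYCLE) num = LOOKUPS_PER_CYCLE;
        while (num-- && db->count > 0) {
            size_t i = (size_t)rand() % db->count;
            if (now > db->when[i]) {
                /* Delete by moving the last entry into the hole. */
                db->when[i] = db->when[db->count - 1];
                db->count--;
                expired++;
                removed++;
            }
        }
    } while (expired > LOOKUPS_PER_CYCLE / 4);
    return removed;
}
```

When nothing is expired the loop runs a single round and stops, which is the "few CPU cycles" case described in the comment; when most keys are stale it keeps iterating.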
@@ -533,41 +577,10 @@ int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
}
}
- /* Try to expire a few timed out keys. The algorithm used is adaptive and
- * will use few CPU cycles if there are few expiring keys, otherwise
- * it will get more aggressive to avoid that too much memory is used by
- * keys that can be removed from the keyspace. */
- for (j = 0; j < server.dbnum; j++) {
- int expired;
- redisDb *db = server.db+j;
-
- /* Continue to expire if at the end of the cycle more than 25%
- * of the keys were expired. */
- do {
- long num = dictSize(db->expires);
- time_t now = time(NULL);
-
- expired = 0;
- if (num > REDIS_EXPIRELOOKUPS_PER_CRON)
- num = REDIS_EXPIRELOOKUPS_PER_CRON;
- while (num--) {
- dictEntry *de;
- time_t t;
-
- if ((de = dictGetRandomKey(db->expires)) == NULL) break;
- t = (time_t) dictGetEntryVal(de);
- if (now > t) {
- sds key = dictGetEntryKey(de);
- robj *keyobj = createStringObject(key,sdslen(key));
-
- dbDelete(db,keyobj);
- decrRefCount(keyobj);
- expired++;
- server.stat_expiredkeys++;
- }
- }
- } while (expired > REDIS_EXPIRELOOKUPS_PER_CRON/4);
- }
+ /* Expire a few keys per cycle, only if this is a master.
+ * On slaves we wait for DEL operations synthesized by the master
+ * in order to guarantee a strict consistency. */
+ if (server.masterhost == NULL) activeExpireCycle();
/* Swap a few keys on disk if we are over the memory limit and VM
* is enabled. Try to free objects from the free list first. */
@@ -734,6 +747,7 @@ void initServerConfig() {
server.hash_max_zipmap_value = REDIS_HASH_MAX_ZIPMAP_VALUE;
server.list_max_ziplist_entries = REDIS_LIST_MAX_ZIPLIST_ENTRIES;
server.list_max_ziplist_value = REDIS_LIST_MAX_ZIPLIST_VALUE;
+ server.set_max_intset_entries = REDIS_SET_MAX_INTSET_ENTRIES;
server.shutdown_asap = 0;
resetServerSaveParams();
@@ -892,9 +906,6 @@ void call(redisClient *c, struct redisCommand *cmd) {
int processCommand(redisClient *c) {
struct redisCommand *cmd;
- /* Free some memory if needed (maxmemory setting) */
- if (server.maxmemory) freeMemoryIfNeeded();
-
/* Handle the multi bulk command type. This is an alternative protocol
* supported by Redis in order to receive commands that are composed of
* multiple binary-safe "bulk" arguments. The latency of processing is
@@ -913,15 +924,20 @@ int processCommand(redisClient *c) {
} else if (c->multibulk) {
if (c->bulklen == -1) {
if (((char*)c->argv[0]->ptr)[0] != '$') {
- addReplySds(c,sdsnew("-ERR multi bulk protocol error\r\n"));
+ addReplyError(c,"multi bulk protocol error");
resetClient(c);
return 1;
} else {
- int bulklen = atoi(((char*)c->argv[0]->ptr)+1);
+ char *eptr;
+ long bulklen = strtol(((char*)c->argv[0]->ptr)+1,&eptr,10);
+ int perr = eptr[0] != '\0';
+
decrRefCount(c->argv[0]);
- if (bulklen < 0 || bulklen > 1024*1024*1024) {
+ if (perr || bulklen == LONG_MIN || bulklen == LONG_MAX ||
+ bulklen < 0 || bulklen > 1024*1024*1024)
+ {
c->argc--;
- addReplySds(c,sdsnew("-ERR invalid bulk write count\r\n"));
+ addReplyError(c,"invalid bulk write count");
resetClient(c);
return 1;
}
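Review note: replacing `atoi()` with `strtol()` lets the parser reject input such as `12abc`, which `atoi()` would silently read as 12. A standalone sketch of the same checks follows. `parse_bulk_len` and `MAX_BULK_LEN` are hypothetical names, and the empty-input check is an extra that the hunk above does not perform.

```c
#include <limits.h>
#include <stdlib.h>

#define MAX_BULK_LEN (1024L*1024*1024) /* 1 GB cap, as in the hunk above */

/* Parse a decimal bulk length. Returns 0 and stores the value in *out on
 * success, -1 on trailing garbage, overflow, or an out-of-range value. */
int parse_bulk_len(const char *s, long *out) {
    char *eptr;
    long v = strtol(s, &eptr, 10);

    if (eptr == s) return -1;          /* no digits at all (extra check, not in the diff) */
    if (eptr[0] != '\0') return -1;    /* trailing garbage after the number */
    if (v == LONG_MIN || v == LONG_MAX) return -1; /* values strtol returns on overflow */
    if (v < 0 || v > MAX_BULK_LEN) return -1;
    *out = v;
    return 0;
}
```

For example, `parse_bulk_len("42", &n)` succeeds, while `"12abc"`, `"-5"` and any value above 1 GB are rejected.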
@@ -974,27 +990,28 @@ int processCommand(redisClient *c) {
* such wrong arity, bad command name and so forth. */
cmd = lookupCommand(c->argv[0]->ptr);
if (!cmd) {
- addReplySds(c,
- sdscatprintf(sdsempty(), "-ERR unknown command '%s'\r\n",
- (char*)c->argv[0]->ptr));
+ addReplyErrorFormat(c,"unknown command '%s'",
+ (char*)c->argv[0]->ptr);
resetClient(c);
return 1;
} else if ((cmd->arity > 0 && cmd->arity != c->argc) ||
(c->argc < -cmd->arity)) {
- addReplySds(c,
- sdscatprintf(sdsempty(),
- "-ERR wrong number of arguments for '%s' command\r\n",
- cmd->name));
+ addReplyErrorFormat(c,"wrong number of arguments for '%s' command",
+ cmd->name);
resetClient(c);
return 1;
} else if (cmd->flags & REDIS_CMD_BULK && c->bulklen == -1) {
/* This is a bulk command, we have to read the last argument yet. */
- int bulklen = atoi(c->argv[c->argc-1]->ptr);
+ char *eptr;
+ long bulklen = strtol(c->argv[c->argc-1]->ptr,&eptr,10);
+ int perr = eptr[0] != '\0';
decrRefCount(c->argv[c->argc-1]);
- if (bulklen < 0 || bulklen > 1024*1024*1024) {
+ if (perr || bulklen == LONG_MAX || bulklen == LONG_MIN ||
+ bulklen < 0 || bulklen > 1024*1024*1024)
+ {
c->argc--;
- addReplySds(c,sdsnew("-ERR invalid bulk write count\r\n"));
+ addReplyError(c,"invalid bulk write count");
resetClient(c);
return 1;
}
@@ -1021,16 +1038,21 @@ int processCommand(redisClient *c) {
/* Check if the user is authenticated */
if (server.requirepass && !c->authenticated && cmd->proc != authCommand) {
- addReplySds(c,sdsnew("-ERR operation not permitted\r\n"));
+ addReplyError(c,"operation not permitted");
resetClient(c);
return 1;
}
- /* Handle the maxmemory directive */
+ /* Handle the maxmemory directive.
+ *
+ * First we try to free some memory if possible (if there are volatile
+ * keys in the dataset). If there are none, the only thing we can do
+ * is return an error. */
+ if (server.maxmemory) freeMemoryIfNeeded();
if (server.maxmemory && (cmd->flags & REDIS_CMD_DENYOOM) &&
zmalloc_used_memory() > server.maxmemory)
{
- addReplySds(c,sdsnew("-ERR command not allowed when used memory > 'maxmemory'\r\n"));
+ addReplyError(c,"command not allowed when used memory > 'maxmemory'");
resetClient(c);
return 1;
}
@@ -1040,7 +1062,7 @@ int processCommand(redisClient *c) {
&&
cmd->proc != subscribeCommand && cmd->proc != unsubscribeCommand &&
cmd->proc != psubscribeCommand && cmd->proc != punsubscribeCommand) {
- addReplySds(c,sdsnew("-ERR only (P)SUBSCRIBE / (P)UNSUBSCRIBE / QUIT allowed in this context\r\n"));
+ addReplyError(c,"only (P)SUBSCRIBE / (P)UNSUBSCRIBE / QUIT allowed in this context");
resetClient(c);
return 1;
}
@@ -1081,11 +1103,7 @@ int prepareForShutdown() {
if (server.vm_enabled) unlink(server.vm_swap_file);
} else {
/* Snapshotting. Perform a SYNC SAVE and exit */
- if (rdbSave(server.dbfilename) == REDIS_OK) {
- if (server.daemonize)
- unlink(server.pidfile);
- redisLog(REDIS_WARNING,"%zu bytes used at exit",zmalloc_used_memory());
- } else {
+ if (rdbSave(server.dbfilename) != REDIS_OK) {
/* Ooops.. error saving! The best we can do is to continue
* operating. Note that if there was a background saving process,
* in the next cron() Redis will be notified that the background
@@ -1095,6 +1113,7 @@ int prepareForShutdown() {
return REDIS_ERR;
}
}
+ if (server.daemonize) unlink(server.pidfile);
redisLog(REDIS_WARNING,"Server exit now, bye bye...");
return REDIS_OK;
}
@@ -1107,7 +1126,7 @@ void authCommand(redisClient *c) {
addReply(c,shared.ok);
} else {
c->authenticated = 0;
- addReplySds(c,sdscatprintf(sdsempty(),"-ERR invalid password\r\n"));
+ addReplyError(c,"invalid password");
}
}
@@ -1148,6 +1167,10 @@ sds genRedisInfoString(void) {
time_t uptime = time(NULL)-server.stat_starttime;
int j;
char hmem[64];
+ struct rusage self_ru, c_ru;
+
+ getrusage(RUSAGE_SELF, &self_ru);
+ getrusage(RUSAGE_CHILDREN, &c_ru);
bytesToHuman(hmem,zmalloc_used_memory());
info = sdscatprintf(sdsempty(),
@@ -1159,11 +1182,16 @@ sds genRedisInfoString(void) {
"process_id:%ld\r\n"
"uptime_in_seconds:%ld\r\n"
"uptime_in_days:%ld\r\n"
+ "used_cpu_sys:%.2f\r\n"
+ "used_cpu_user:%.2f\r\n"
+ "used_cpu_sys_childrens:%.2f\r\n"
+ "used_cpu_user_childrens:%.2f\r\n"
"connected_clients:%d\r\n"
"connected_slaves:%d\r\n"
"blocked_clients:%d\r\n"
"used_memory:%zu\r\n"
"used_memory_human:%s\r\n"
+ "mem_fragmentation_ratio:%.2f\r\n"
"changes_since_last_save:%lld\r\n"
"bgsave_in_progress:%d\r\n"
"last_save_time:%ld\r\n"
@@ -1185,11 +1213,16 @@ sds genRedisInfoString(void) {
(long) getpid(),
uptime,
uptime/(3600*24),
+ (float)self_ru.ru_stime.tv_sec+(float)self_ru.ru_stime.tv_usec/1000000,
+ (float)self_ru.ru_utime.tv_sec+(float)self_ru.ru_utime.tv_usec/1000000,
+ (float)c_ru.ru_stime.tv_sec+(float)c_ru.ru_stime.tv_usec/1000000,
+ (float)c_ru.ru_utime.tv_sec+(float)c_ru.ru_utime.tv_usec/1000000,
listLength(server.clients)-listLength(server.slaves),
listLength(server.slaves),
server.blpop_blocked_clients,
zmalloc_used_memory(),
hmem,
+ zmalloc_get_fragmentation_ratio(),
server.dirty,
server.bgsavechildpid != -1,
server.lastsave,
@@ -1319,7 +1352,8 @@ void freeMemoryIfNeeded(void) {
if (tryFreeOneObjectFromFreelist() == REDIS_OK) continue;
for (j = 0; j < server.dbnum; j++) {
int minttl = -1;
- robj *minkey = NULL;
+ sds minkey = NULL;
+ robj *keyobj = NULL;
struct dictEntry *de;
if (dictSize(server.db[j].expires)) {
@@ -1336,7 +1370,10 @@ void freeMemoryIfNeeded(void) {
minttl = t;
}
}
- dbDelete(server.db+j,minkey);
+ keyobj = createStringObject(minkey,sdslen(minkey));
+ dbDelete(server.db+j,keyobj);
+ server.stat_expiredkeys++;
+ decrRefCount(keyobj);
}
}
if (!freed) return; /* nothing to free... */
@@ -1367,9 +1404,17 @@ void linuxOvercommitMemoryWarning(void) {
}
#endif /* __linux__ */
+void createPidFile(void) {
+ /* Try to write the pid file in a best-effort way. */
+ FILE *fp = fopen(server.pidfile,"w");
+ if (fp) {
+ fprintf(fp,"%d\n",getpid());
+ fclose(fp);
+ }
+}
+
void daemonize(void) {
int fd;
- FILE *fp;
if (fork() != 0) exit(0); /* parent exits */
setsid(); /* create a new session */
@@ -1383,12 +1428,6 @@ void daemonize(void) {
dup2(fd, STDERR_FILENO);
if (fd > STDERR_FILENO) close(fd);
}
- /* Try to write the pid file */
- fp = fopen(server.pidfile,"w");
- if (fp) {
- fprintf(fp,"%d\n",getpid());
- fclose(fp);
- }
}
void version() {
@@ -1421,6 +1460,7 @@ int main(int argc, char **argv) {
}
if (server.daemonize) daemonize();
initServer();
+ if (server.daemonize) createPidFile();
redisLog(REDIS_NOTICE,"Server started, Redis version " REDIS_VERSION);
#ifdef __linux__
linuxOvercommitMemoryWarning();
@@ -1500,6 +1540,7 @@ void segvHandler(int sig, siginfo_t *info, void *secret) {
redisLog(REDIS_WARNING,"%s", messages[i]);
/* free(messages); Don't call free() with possibly corrupted memory. */
+ if (server.daemonize) unlink(server.pidfile);
_exit(0);
}
diff --git a/src/redis.h b/src/redis.h
index 38f0c140..8e05a4d4 100644
--- a/src/redis.h
+++ b/src/redis.h
@@ -26,6 +26,7 @@
#include "anet.h" /* Networking the easy way */
#include "zipmap.h" /* Compact string -> string data structure */
#include "ziplist.h" /* Compact list data structure */
+#include "intset.h" /* Compact integer set structure */
#include "version.h"
/* Error codes */
@@ -46,6 +47,7 @@
#define REDIS_MAX_WRITE_PER_EVENT (1024*64)
#define REDIS_REQUEST_MAX_SIZE (1024*1024*256) /* max bytes in inline command */
#define REDIS_SHARED_INTEGERS 10000
+#define REDIS_REPLY_CHUNK_BYTES (5*1500) /* 5 TCP packets with default MTU */
/* If more than REDIS_WRITEV_THRESHOLD write packets are pending use writev */
#define REDIS_WRITEV_THRESHOLD 3
@@ -82,6 +84,7 @@
#define REDIS_ENCODING_ZIPMAP 3 /* Encoded as zipmap */
#define REDIS_ENCODING_LINKEDLIST 4 /* Encoded as regular linked list */
#define REDIS_ENCODING_ZIPLIST 5 /* Encoded as ziplist */
+#define REDIS_ENCODING_INTSET 6 /* Encoded as intset */
/* Object types only used for dumping to disk */
#define REDIS_EXPIRETIME 253
@@ -188,6 +191,7 @@
#define REDIS_HASH_MAX_ZIPMAP_VALUE 512
#define REDIS_LIST_MAX_ZIPLIST_ENTRIES 1024
#define REDIS_LIST_MAX_ZIPLIST_VALUE 32
+#define REDIS_SET_MAX_INTSET_ENTRIES 4096
/* Sets operations codes */
#define REDIS_OP_UNION 0
@@ -282,8 +286,9 @@ typedef struct redisClient {
int dictid;
sds querybuf;
robj **argv, **mbargv;
+ char *newline; /* pointing to the detected newline in querybuf */
int argc, mbargc;
- int bulklen; /* bulk read len. -1 if not in bulk read mode */
+ long bulklen; /* bulk read len. -1 if not in bulk read mode */
int multibulk; /* multi bulk command format active */
list *reply;
int sentlen;
@@ -306,6 +311,10 @@ typedef struct redisClient {
list *watched_keys; /* Keys WATCHED for MULTI/EXEC CAS */
dict *pubsub_channels; /* channels a client is interested in (SUBSCRIBE) */
list *pubsub_patterns; /* patterns a client is interested in (SUBSCRIBE) */
+
+ /* Response buffer */
+ int bufpos;
+ char buf[REDIS_REPLY_CHUNK_BYTES];
} redisClient;
struct saveparam {
@@ -335,6 +344,7 @@ struct redisServer {
int sofd;
redisDb *db;
long long dirty; /* changes to DB from the last save */
+ long long dirty_before_bgsave; /* used to restore dirty on failed BGSAVE */
list *clients;
list *slaves, *monitors;
char neterr[ANET_ERR_LEN];
@@ -400,6 +410,7 @@ struct redisServer {
size_t hash_max_zipmap_value;
size_t list_max_ziplist_entries;
size_t list_max_ziplist_value;
+ size_t set_max_intset_entries;
/* Virtual memory state */
FILE *vm_fp;
int vm_fd;
@@ -482,13 +493,14 @@ typedef struct _redisSortOperation {
} redisSortOperation;
/* ZSETs use a specialized version of Skiplists */
-
typedef struct zskiplistNode {
- struct zskiplistNode **forward;
- struct zskiplistNode *backward;
- unsigned int *span;
- double score;
robj *obj;
+ double score;
+ struct zskiplistNode *backward;
+ struct zskiplistLevel {
+ struct zskiplistNode *forward;
+ unsigned int span;
+ } level[];
} zskiplistNode;
typedef struct zskiplist {
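Review note: the new `zskiplistNode` layout replaces the separately allocated `forward` and `span` arrays with a C99 flexible array member, so a node and all of its levels can live in one allocation. A minimal sketch of that allocation pattern follows; `demo_node`, `demo_level` and `demo_node_create` are hypothetical names, not the Redis ones.

```c
#include <stdlib.h>

/* Illustrative node with the same shape as the new zskiplistNode
 * (the object pointer is left out to keep the sketch standalone). */
typedef struct demo_node {
    double score;
    struct demo_node *backward;
    struct demo_level {
        struct demo_node *forward;
        unsigned int span;
    } level[];               /* C99 flexible array member */
} demo_node;

/* One allocation covers the fixed header plus `levels` level slots. */
demo_node *demo_node_create(int levels, double score) {
    demo_node *n = malloc(sizeof(*n) + (size_t)levels * sizeof(struct demo_level));

    if (n == NULL) return NULL;
    n->score = score;
    n->backward = NULL;
    for (int i = 0; i < levels; i++) {
        n->level[i].forward = NULL;
        n->level[i].span = 0;
    }
    return n;
}
```

A single `free(n)` then releases the node together with its levels, and `n->level[i].forward` and `n->level[i].span` sit next to each other in memory.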
@@ -537,6 +549,14 @@ typedef struct {
listNode *ln; /* Entry in linked list */
} listTypeEntry;
+/* Structure to hold set iteration abstraction. */
+typedef struct {
+ robj *subject;
+ int encoding;
+ int ii; /* intset iterator */
+ dictIterator *di;
+} setTypeIterator;
+
/* Structure to hold hash iteration abstraction. Note that iteration over
* hashes involves both fields and values. Because it is possible that
* not both are required, store pointers in the iterator to avoid
@@ -577,6 +597,8 @@ void resetClient(redisClient *c);
void sendReplyToClient(aeEventLoop *el, int fd, void *privdata, int mask);
void sendReplyToClientWritev(aeEventLoop *el, int fd, void *privdata, int mask);
void addReply(redisClient *c, robj *obj);
+void *addDeferredMultiBulkLength(redisClient *c);
+void setDeferredMultiBulkLength(redisClient *c, void *node, long length);
void addReplySds(redisClient *c, sds s);
void processInputBuffer(redisClient *c);
void acceptTcpHandler(aeEventLoop *el, int fd, void *privdata, int mask);
@@ -587,11 +609,23 @@ void addReplyBulkCString(redisClient *c, char *s);
void acceptHandler(aeEventLoop *el, int fd, void *privdata, int mask);
void addReply(redisClient *c, robj *obj);
void addReplySds(redisClient *c, sds s);
+void addReplyError(redisClient *c, char *err);
+void addReplyStatus(redisClient *c, char *status);
void addReplyDouble(redisClient *c, double d);
void addReplyLongLong(redisClient *c, long long ll);
-void addReplyUlong(redisClient *c, unsigned long ul);
+void addReplyMultiBulkLen(redisClient *c, long length);
void *dupClientReplyValue(void *o);
+#ifdef __GNUC__
+void addReplyErrorFormat(redisClient *c, const char *fmt, ...)
+ __attribute__((format(printf, 2, 3)));
+void addReplyStatusFormat(redisClient *c, const char *fmt, ...)
+ __attribute__((format(printf, 2, 3)));
+#else
+void addReplyErrorFormat(redisClient *c, const char *fmt, ...);
+void addReplyStatusFormat(redisClient *c, const char *fmt, ...);
+#endif
+
/* List data type */
void listTypeTryConversion(robj *subject, robj *value);
void listTypePush(robj *subject, robj *value, int where);
@@ -636,6 +670,7 @@ robj *createStringObjectFromLongLong(long long value);
robj *createListObject(void);
robj *createZiplistObject(void);
robj *createSetObject(void);
+robj *createIntsetObject(void);
robj *createHashObject(void);
robj *createZsetObject(void);
int getLongFromObjectOrReply(redisClient *c, robj *o, long *target, const char *msg);
@@ -677,7 +712,7 @@ void backgroundRewriteDoneHandler(int statloc);
/* Sorted sets data type */
zskiplist *zslCreate(void);
void zslFree(zskiplist *zsl);
-void zslInsert(zskiplist *zsl, double score, robj *obj);
+zskiplistNode *zslInsert(zskiplist *zsl, double score, robj *obj);
/* Core functions */
void freeMemoryIfNeeded(void);
@@ -719,6 +754,18 @@ int dontWaitForSwappedKey(redisClient *c, robj *key);
void handleClientsBlockedOnSwappedKey(redisDb *db, robj *key);
vmpointer *vmSwapObjectBlocking(robj *val);
+/* Set data type */
+robj *setTypeCreate(robj *value);
+int setTypeAdd(robj *subject, robj *value);
+int setTypeRemove(robj *subject, robj *value);
+int setTypeIsMember(robj *subject, robj *value);
+setTypeIterator *setTypeInitIterator(robj *subject);
+void setTypeReleaseIterator(setTypeIterator *si);
+robj *setTypeNext(setTypeIterator *si);
+robj *setTypeRandomElement(robj *subject);
+unsigned long setTypeSize(robj *subject);
+void setTypeConvert(robj *subject, int enc);
+
/* Hash data type */
void convertToRealHash(robj *o);
void hashTypeTryConversion(robj *subject, robj **argv, int start, int end);
@@ -747,6 +794,8 @@ int stringmatch(const char *pattern, const char *string, int nocase);
long long memtoll(const char *p, int *err);
int ll2string(char *s, size_t len, long long value);
int isStringRepresentableAsLong(sds s, long *longval);
+int isStringRepresentableAsLongLong(sds s, long long *longval);
+int isObjectRepresentableAsLongLong(robj *o, long long *llongval);
/* Configuration */
void loadServerConfig(char *filename);
@@ -755,10 +804,10 @@ void resetServerSaveParams();
/* db.c -- Keyspace access API */
int removeExpire(redisDb *db, robj *key);
+void propagateExpire(redisDb *db, robj *key);
int expireIfNeeded(redisDb *db, robj *key);
-int deleteIfVolatile(redisDb *db, robj *key);
time_t getExpire(redisDb *db, robj *key);
-int setExpire(redisDb *db, robj *key, time_t when);
+void setExpire(redisDb *db, robj *key, time_t when);
robj *lookupKey(redisDb *db, robj *key);
robj *lookupKeyRead(redisDb *db, robj *key);
robj *lookupKeyWrite(redisDb *db, robj *key);
@@ -841,6 +890,7 @@ void expireCommand(redisClient *c);
void expireatCommand(redisClient *c);
void getsetCommand(redisClient *c);
void ttlCommand(redisClient *c);
+void persistCommand(redisClient *c);
void slaveofCommand(redisClient *c);
void debugCommand(redisClient *c);
void msetCommand(redisClient *c);
diff --git a/src/replication.c b/src/replication.c
index 5387db91..8c629006 100644
--- a/src/replication.c
+++ b/src/replication.c
@@ -138,7 +138,7 @@ int syncRead(int fd, char *ptr, ssize_t size, int timeout) {
while(size) {
if (aeWait(fd,AE_READABLE,1000) & AE_READABLE) {
nread = read(fd,ptr,size);
- if (nread == -1) return -1;
+ if (nread <= 0) return -1;
ptr += nread;
size -= nread;
totread += nread;
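Review note: `read()` returns 0 when the peer has closed the connection. With the old `nread == -1` test, a 0 return would leave `size` unchanged, so the loop would presumably keep going without making progress. A standalone sketch of a read loop that treats both cases as failure follows; `read_full` is a hypothetical helper, and the EINTR retry is an addition that the diff does not contain.

```c
#include <errno.h>
#include <unistd.h>

/* Read exactly `size` bytes from fd. Returns the byte count on success,
 * or -1 on error or if the peer closes the connection early. */
ssize_t read_full(int fd, char *ptr, size_t size) {
    size_t total = 0;

    while (total < size) {
        ssize_t nread = read(fd, ptr + total, size - total);

        if (nread == -1 && errno == EINTR) continue; /* retry (not in the diff) */
        if (nread <= 0) return -1;  /* -1 is an error, 0 is end of file */
        total += (size_t)nread;
    }
    return (ssize_t)total;
}
```

On a pipe whose write end has been closed, a request for more bytes than were written returns -1 instead of looping.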
@@ -176,12 +176,19 @@ void syncCommand(redisClient *c) {
/* ignore SYNC if already slave or in monitor mode */
if (c->flags & REDIS_SLAVE) return;
+ /* Refuse SYNC requests if we are a slave but the link with our master
+ * is not ok... */
+ if (server.masterhost && server.replstate != REDIS_REPL_CONNECTED) {
+ addReplyError(c,"Can't SYNC while not connected with my master");
+ return;
+ }
+
/* SYNC can't be issued when the server has pending data to send to
* the client about already issued commands. We need a fresh reply
* buffer registering the differences between the BGSAVE and the current
* dataset, so that we can copy to other slaves if needed. */
if (listLength(c->reply) != 0) {
- addReplySds(c,sdsnew("-ERR SYNC is invalid with pending input\r\n"));
+ addReplyError(c,"SYNC is invalid with pending input");
return;
}
@@ -219,7 +226,7 @@ void syncCommand(redisClient *c) {
redisLog(REDIS_NOTICE,"Starting BGSAVE for SYNC");
if (rdbSaveBackground(server.dbfilename) != REDIS_OK) {
redisLog(REDIS_NOTICE,"Replication failed, can't BGSAVE");
- addReplySds(c,sdsnew("-ERR Unalbe to perform background save\r\n"));
+ addReplyError(c,"Unable to perform background save");
return;
}
c->replstate = REDIS_REPL_WAIT_BGSAVE_END;
@@ -392,7 +399,12 @@ int syncWithMaster(void) {
strerror(errno));
return REDIS_ERR;
}
- if (buf[0] != '$') {
+ if (buf[0] == '-') {
+ close(fd);
+ redisLog(REDIS_WARNING,"MASTER aborted replication with an error: %s",
+ buf+1);
+ return REDIS_ERR;
+ } else if (buf[0] != '$') {
close(fd);
redisLog(REDIS_WARNING,"Bad protocol from MASTER, the first byte is not '$', are you sure the host and port are right?");
return REDIS_ERR;
@@ -416,9 +428,9 @@ int syncWithMaster(void) {
int nread, nwritten;
nread = read(fd,buf,(dumpsize < 1024)?dumpsize:1024);
- if (nread == -1) {
+ if (nread <= 0) {
redisLog(REDIS_WARNING,"I/O error trying to sync with MASTER: %s",
- strerror(errno));
+ (nread == -1) ? strerror(errno) : "connection lost");
close(fd);
close(dfd);
return REDIS_ERR;
diff --git a/src/sds.c b/src/sds.c
index 5e67f044..2d063c4a 100644
--- a/src/sds.c
+++ b/src/sds.c
@@ -33,7 +33,6 @@
#include "sds.h"
#include
#include
-#include
#include
#include
#include "zmalloc.h"
@@ -156,8 +155,8 @@ sds sdscpy(sds s, char *t) {
return sdscpylen(s, t, strlen(t));
}
-sds sdscatprintf(sds s, const char *fmt, ...) {
- va_list ap;
+sds sdscatvprintf(sds s, const char *fmt, va_list ap) {
+ va_list cpy;
char *buf, *t;
size_t buflen = 16;
@@ -169,9 +168,8 @@ sds sdscatprintf(sds s, const char *fmt, ...) {
if (buf == NULL) return NULL;
#endif
buf[buflen-2] = '\0';
- va_start(ap, fmt);
- vsnprintf(buf, buflen, fmt, ap);
- va_end(ap);
+ va_copy(cpy,ap);
+ vsnprintf(buf, buflen, fmt, cpy);
if (buf[buflen-2] != '\0') {
zfree(buf);
buflen *= 2;
@@ -184,6 +182,15 @@ sds sdscatprintf(sds s, const char *fmt, ...) {
return t;
}
+sds sdscatprintf(sds s, const char *fmt, ...) {
+ va_list ap;
+ char *t;
+ va_start(ap, fmt);
+ t = sdscatvprintf(s,fmt,ap);
+ va_end(ap);
+ return t;
+}
+
sds sdstrim(sds s, const char *cset) {
struct sdshdr *sh = (void*) (s-(sizeof(struct sdshdr)));
char *start, *end, *sp, *ep;
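Review note: splitting `sdscatprintf()` into a `va_list` core plus a thin variadic wrapper is what allows other variadic helpers to reuse the formatter. Because a `va_list` may not be reused after a `vsnprintf()` call has consumed it, each pass works on a `va_copy()`. The sketch below shows the same split on plain `malloc()` strings. `format_alloc_v` and `format_alloc` are hypothetical names, and the buffer is sized from the `vsnprintf()` return value (C99) rather than with the doubling loop used in sds.c.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a freshly malloc()ed string. The caller's va_list is never
 * consumed directly: each vsnprintf() call works on its own va_copy(). */
char *format_alloc_v(const char *fmt, va_list ap) {
    va_list cpy;
    int needed;
    char *buf;

    va_copy(cpy, ap);
    needed = vsnprintf(NULL, 0, fmt, cpy);   /* first pass: measure */
    va_end(cpy);
    if (needed < 0) return NULL;

    buf = malloc((size_t)needed + 1);
    if (buf == NULL) return NULL;

    va_copy(cpy, ap);
    vsnprintf(buf, (size_t)needed + 1, fmt, cpy);  /* second pass: write */
    va_end(cpy);
    return buf;
}

/* Variadic front end, analogous to sdscatprintf() wrapping sdscatvprintf(). */
char *format_alloc(const char *fmt, ...) {
    va_list ap;
    char *s;

    va_start(ap, fmt);
    s = format_alloc_v(fmt, ap);
    va_end(ap);
    return s;
}
```

The caller owns the returned string and releases it with `free()`.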
@@ -216,13 +223,16 @@ sds sdsrange(sds s, int start, int end) {
}
newlen = (start > end) ? 0 : (end-start)+1;
if (newlen != 0) {
- if (start >= (signed)len) start = len-1;
- if (end >= (signed)len) end = len-1;
- newlen = (start > end) ? 0 : (end-start)+1;
+ if (start >= (signed)len) {
+ newlen = 0;
+ } else if (end >= (signed)len) {
+ end = len-1;
+ newlen = (start > end) ? 0 : (end-start)+1;
+ }
} else {
start = 0;
}
- if (start != 0) memmove(sh->buf, sh->buf+start, newlen);
+ if (start && newlen) memmove(sh->buf, sh->buf+start, newlen);
sh->buf[newlen] = 0;
sh->free = sh->free+(sh->len-newlen);
sh->len = newlen;
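Review note: the removed lines clamp `start` to `len-1`, so a range starting past the end of the string would return the last byte instead of an empty string. The new branch returns an empty result in that case and clamps only `end`. A sketch of the same clamping on a plain NUL-terminated buffer follows. `range_inplace` is a hypothetical helper, and the negative-index handling is assumed from the test cases in this patch because that part of `sdsrange()` is outside the hunk.

```c
#include <string.h>

/* Trim the NUL-terminated string s in place to the inclusive range
 * [start, end]; negative indexes count from the end (-1 is the last byte).
 * Returns the new length. */
size_t range_inplace(char *s, int start, int end) {
    int len = (int)strlen(s);
    int newlen;

    if (len == 0) return 0;
    if (start < 0) { start += len; if (start < 0) start = 0; }
    if (end < 0)   { end += len;   if (end < 0) end = 0; }

    newlen = (start > end) ? 0 : (end - start) + 1;
    if (newlen != 0) {
        if (start >= len) {
            newlen = 0;                 /* range starts past the end: empty */
        } else if (end >= len) {
            end = len - 1;              /* clamp the end only */
            newlen = (end - start) + 1;
        }
    }
    if (start > 0 && newlen > 0) memmove(s, s + start, (size_t)newlen);
    s[newlen] = '\0';
    return (size_t)newlen;
}
```

With the buffer holding `"ciao"`, the range (1,100) gives `"iao"` and the range (100,100) gives the empty string, matching the `sdsrange(...,1,100)` and `sdsrange(...,100,100)` test cases added by this patch.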
@@ -382,3 +392,182 @@ sds sdscatrepr(sds s, char *p, size_t len) {
}
return sdscatlen(s,"\"",1);
}
+
+/* Split a line into arguments, where every argument can be in the
+ * following programming-language REPL-alike form:
+ *
+ * foo bar "newline are supported\n" and "\xff\x00otherstuff"
+ *
+ * The number of arguments is stored into *argc, and an array
+ * of sds is returned. The caller should sdsfree() all the returned
+ * strings and finally zfree() the array itself.
+ *
+ * Note that sdscatrepr() is able to convert back a string into
+ * a quoted string in the same format sdssplitargs() is able to parse.
+ */
+sds *sdssplitargs(char *line, int *argc) {
+ char *p = line;
+ char *current = NULL;
+ char **vector = NULL;
+
+ *argc = 0;
+ while(1) {
+ /* skip blanks */
+ while(*p && isspace(*p)) p++;
+ if (*p) {
+ /* get a token */
+ int inq=0; /* set to 1 if we are in "quotes" */
+ int done=0;
+
+ if (current == NULL) current = sdsempty();
+ while(!done) {
+ if (inq) {
+ if (*p == '\\' && *(p+1)) {
+ char c;
+
+ p++;
+ switch(*p) {
+ case 'n': c = '\n'; break;
+ case 'r': c = '\r'; break;
+ case 't': c = '\t'; break;
+ case 'b': c = '\b'; break;
+ case 'a': c = '\a'; break;
+ default: c = *p; break;
+ }
+ current = sdscatlen(current,&c,1);
+ } else if (*p == '"') {
+ /* closing quote must be followed by a space */
+ if (*(p+1) && !isspace(*(p+1))) goto err;
+ done=1;
+ } else if (!*p) {
+ /* unterminated quotes */
+ goto err;
+ } else {
+ current = sdscatlen(current,p,1);
+ }
+ } else {
+ switch(*p) {
+ case ' ':
+ case '\n':
+ case '\r':
+ case '\t':
+ case '\0':
+ done=1;
+ break;
+ case '"':
+ inq=1;
+ break;
+ default:
+ current = sdscatlen(current,p,1);
+ break;
+ }
+ }
+ if (*p) p++;
+ }
+ /* add the token to the vector */
+ vector = zrealloc(vector,((*argc)+1)*sizeof(char*));
+ vector[*argc] = current;
+ (*argc)++;
+ current = NULL;
+ } else {
+ return vector;
+ }
+ }
+
+err:
+ while((*argc)--)
+ sdsfree(vector[*argc]);
+ zfree(vector);
+ if (current) sdsfree(current);
+ return NULL;
+}
+
+#ifdef SDS_TEST_MAIN
+#include
+#include "testhelp.h"
+
+int main(void) {
+ {
+ sds x = sdsnew("foo"), y;
+
+ test_cond("Create a string and obtain the length",
+ sdslen(x) == 3 && memcmp(x,"foo\0",4) == 0)
+
+ sdsfree(x);
+ x = sdsnewlen("foo",2);
+ test_cond("Create a string with specified length",
+ sdslen(x) == 2 && memcmp(x,"fo\0",3) == 0)
+
+ x = sdscat(x,"bar");
+ test_cond("Strings concatenation",
+ sdslen(x) == 5 && memcmp(x,"fobar\0",6) == 0);
+
+ x = sdscpy(x,"a");
+ test_cond("sdscpy() against an originally longer string",
+ sdslen(x) == 1 && memcmp(x,"a\0",2) == 0)
+
+ x = sdscpy(x,"xyzxxxxxxxxxxyyyyyyyyyykkkkkkkkkk");
+ test_cond("sdscpy() against an originally shorter string",
+ sdslen(x) == 33 &&
+ memcmp(x,"xyzxxxxxxxxxxyyyyyyyyyykkkkkkkkkk\0",34) == 0)
+
+ sdsfree(x);
+ x = sdscatprintf(sdsempty(),"%d",123);
+ test_cond("sdscatprintf() seems working in the base case",
+ sdslen(x) == 3 && memcmp(x,"123\0",4) ==0)
+
+ sdsfree(x);
+ x = sdstrim(sdsnew("xxciaoyyy"),"xy");
+ test_cond("sdstrim() correctly trims characters",
+ sdslen(x) == 4 && memcmp(x,"ciao\0",5) == 0)
+
+ y = sdsrange(sdsdup(x),1,1);
+ test_cond("sdsrange(...,1,1)",
+ sdslen(y) == 1 && memcmp(y,"i\0",2) == 0)
+
+ sdsfree(y);
+ y = sdsrange(sdsdup(x),1,-1);
+ test_cond("sdsrange(...,1,-1)",
+ sdslen(y) == 3 && memcmp(y,"iao\0",4) == 0)
+
+ sdsfree(y);
+ y = sdsrange(sdsdup(x),-2,-1);
+ test_cond("sdsrange(...,-2,-1)",
+ sdslen(y) == 2 && memcmp(y,"ao\0",3) == 0)
+
+ sdsfree(y);
+ y = sdsrange(sdsdup(x),2,1);
+ test_cond("sdsrange(...,2,1)",
+ sdslen(y) == 0 && memcmp(y,"\0",1) == 0)
+
+ sdsfree(y);
+ y = sdsrange(sdsdup(x),1,100);
+ test_cond("sdsrange(...,1,100)",
+ sdslen(y) == 3 && memcmp(y,"iao\0",4) == 0)
+
+ sdsfree(y);
+ y = sdsrange(sdsdup(x),100,100);
+ test_cond("sdsrange(...,100,100)",
+ sdslen(y) == 0 && memcmp(y,"\0",1) == 0)
+
+ sdsfree(y);
+ sdsfree(x);
+ x = sdsnew("foo");
+ y = sdsnew("foa");
+ test_cond("sdscmp(foo,foa)", sdscmp(x,y) > 0)
+
+ sdsfree(y);
+ sdsfree(x);
+ x = sdsnew("bar");
+ y = sdsnew("bar");
+ test_cond("sdscmp(bar,bar)", sdscmp(x,y) == 0)
+
+ sdsfree(y);
+ sdsfree(x);
+ x = sdsnew("aar");
+ y = sdsnew("bar");
+ test_cond("sdscmp(aar,bar)", sdscmp(x,y) < 0)
+ }
+ test_report()
+}
+#endif
diff --git a/src/sds.h b/src/sds.h
index ef3a418f..ae0f84fb 100644
--- a/src/sds.h
+++ b/src/sds.h
@@ -32,6 +32,7 @@
#define __SDS_H
#include
+#include
typedef char *sds;
@@ -53,6 +54,7 @@ sds sdscat(sds s, char *t);
sds sdscpylen(sds s, char *t, size_t len);
sds sdscpy(sds s, char *t);
+sds sdscatvprintf(sds s, const char *fmt, va_list ap);
#ifdef __GNUC__
sds sdscatprintf(sds s, const char *fmt, ...)
__attribute__((format(printf, 2, 3)));
@@ -70,5 +72,6 @@ void sdstolower(sds s);
void sdstoupper(sds s);
sds sdsfromlonglong(long long value);
sds sdscatrepr(sds s, char *p, size_t len);
+sds *sdssplitargs(char *line, int *argc);
#endif
diff --git a/src/solarisfixes.h b/src/solarisfixes.h
index ce8e7b6f..3cb091d4 100644
--- a/src/solarisfixes.h
+++ b/src/solarisfixes.h
@@ -1,6 +1,7 @@
/* Solaris specific fixes */
#if defined(__GNUC__)
+#include
#undef isnan
#define isnan(x) \
__extension__({ __typeof (x) __x_a = (x); \
diff --git a/src/sort.c b/src/sort.c
index 4295a6ec..79f79010 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -202,7 +202,7 @@ void sortCommand(redisClient *c) {
/* Load the sorting vector with all the objects to sort */
switch(sortval->type) {
case REDIS_LIST: vectorlen = listTypeLength(sortval); break;
- case REDIS_SET: vectorlen = dictSize((dict*)sortval->ptr); break;
+ case REDIS_SET: vectorlen = setTypeSize(sortval); break;
case REDIS_ZSET: vectorlen = dictSize(((zset*)sortval->ptr)->dict); break;
default: vectorlen = 0; redisPanic("Bad SORT type"); /* Avoid GCC warning */
}
@@ -219,18 +219,20 @@ void sortCommand(redisClient *c) {
j++;
}
listTypeReleaseIterator(li);
- } else {
- dict *set;
+ } else if (sortval->type == REDIS_SET) {
+ setTypeIterator *si = setTypeInitIterator(sortval);
+ robj *ele;
+ while((ele = setTypeNext(si)) != NULL) {
+ vector[j].obj = ele;
+ vector[j].u.score = 0;
+ vector[j].u.cmpobj = NULL;
+ j++;
+ }
+ setTypeReleaseIterator(si);
+ } else if (sortval->type == REDIS_ZSET) {
+ dict *set = ((zset*)sortval->ptr)->dict;
dictIterator *di;
dictEntry *setele;
-
- if (sortval->type == REDIS_SET) {
- set = sortval->ptr;
- } else {
- zset *zs = sortval->ptr;
- set = zs->dict;
- }
-
di = dictGetIterator(set);
while((setele = dictNext(di)) != NULL) {
vector[j].obj = dictGetEntryKey(setele);
@@ -239,6 +241,8 @@ void sortCommand(redisClient *c) {
j++;
}
dictReleaseIterator(di);
+ } else {
+ redisPanic("Unknown type");
}
redisAssert(j == vectorlen);
@@ -303,7 +307,7 @@ void sortCommand(redisClient *c) {
outputlen = getop ? getop*(end-start+1) : end-start+1;
if (storekey == NULL) {
/* STORE option not specified, send the sorting result to client */
- addReplySds(c,sdscatprintf(sdsempty(),"*%d\r\n",outputlen));
+ addReplyMultiBulkLen(c,outputlen);
for (j = start; j <= end; j++) {
listNode *ln;
listIter li;
@@ -365,11 +369,11 @@ void sortCommand(redisClient *c) {
* replaced. */
server.dirty += 1+outputlen;
touchWatchedKey(c->db,storekey);
- addReplySds(c,sdscatprintf(sdsempty(),":%d\r\n",outputlen));
+ addReplyLongLong(c,outputlen);
}
/* Cleanup */
- if (sortval->type == REDIS_LIST)
+ if (sortval->type == REDIS_LIST || sortval->type == REDIS_SET)
for (j = 0; j < vectorlen; j++)
decrRefCount(vector[j].obj);
decrRefCount(sortval);
diff --git a/src/t_hash.c b/src/t_hash.c
index b6be284f..5cef1cab 100644
--- a/src/t_hash.c
+++ b/src/t_hash.c
@@ -249,7 +249,7 @@ void hmsetCommand(redisClient *c) {
robj *o;
if ((c->argc % 2) == 1) {
- addReplySds(c,sdsnew("-ERR wrong number of arguments for HMSET\r\n"));
+ addReplyError(c,"wrong number of arguments for HMSET");
return;
}
@@ -315,7 +315,7 @@ void hmgetCommand(redisClient *c) {
/* Note the check for o != NULL happens inside the loop. This is
* done because objects that cannot be found are considered to be
* an empty hash. The reply should then be a series of NULLs. */
- addReplySds(c,sdscatprintf(sdsempty(),"*%d\r\n",c->argc-2));
+ addReplyMultiBulkLen(c,c->argc-2);
for (i = 2; i < c->argc; i++) {
if (o != NULL && (value = hashTypeGet(o,c->argv[i])) != NULL) {
addReplyBulk(c,value);
@@ -346,21 +346,19 @@ void hlenCommand(redisClient *c) {
if ((o = lookupKeyReadOrReply(c,c->argv[1],shared.czero)) == NULL ||
checkType(c,o,REDIS_HASH)) return;
- addReplyUlong(c,hashTypeLength(o));
+ addReplyLongLong(c,hashTypeLength(o));
}
void genericHgetallCommand(redisClient *c, int flags) {
- robj *o, *lenobj, *obj;
+ robj *o, *obj;
unsigned long count = 0;
hashTypeIterator *hi;
+ void *replylen = NULL;
if ((o = lookupKeyReadOrReply(c,c->argv[1],shared.emptymultibulk)) == NULL
|| checkType(c,o,REDIS_HASH)) return;
- lenobj = createObject(REDIS_STRING,NULL);
- addReply(c,lenobj);
- decrRefCount(lenobj);
-
+ replylen = addDeferredMultiBulkLength(c);
hi = hashTypeInitIterator(o);
while (hashTypeNext(hi) != REDIS_ERR) {
if (flags & REDIS_HASH_KEY) {
@@ -377,8 +375,7 @@ void genericHgetallCommand(redisClient *c, int flags) {
}
}
hashTypeReleaseIterator(hi);
-
- lenobj->ptr = sdscatprintf(sdsempty(),"*%lu\r\n",count);
+ setDeferredMultiBulkLength(c,replylen,count);
}
void hkeysCommand(redisClient *c) {
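The genericHgetallCommand hunks above drop the hand-patched lenobj placeholder in favour of addDeferredMultiBulkLength and setDeferredMultiBulkLength: a slot for the "*<count>" header is reserved first and filled in once the element count is known. A minimal sketch of that reserve-then-fill idea follows. It assumes a hypothetical fixed-size reply queue; reply_queue and every function below are illustrative names, not the Redis API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_CHUNKS 16
#define CHUNK_LEN 32

/* Hypothetical reply queue: a fixed array of small text chunks. */
typedef struct {
    char chunk[MAX_CHUNKS][CHUNK_LEN];
    int used;
} reply_queue;

/* Append one chunk and return its slot index. */
int reply_add(reply_queue *q, const char *text) {
    assert(q->used < MAX_CHUNKS && strlen(text) < CHUNK_LEN);
    strcpy(q->chunk[q->used], text);
    return q->used++;
}

/* Reserve an empty slot where the "*<count>\r\n" header will go. */
int reply_add_deferred_len(reply_queue *q) {
    return reply_add(q, "");
}

/* Fill the reserved slot once the element count is known. */
void reply_set_deferred_len(reply_queue *q, int slot, unsigned long count) {
    assert(slot >= 0 && slot < q->used);
    snprintf(q->chunk[slot], CHUNK_LEN, "*%lu\r\n", count);
}

/* Concatenate all chunks into out (capacity outlen) in queue order. */
void reply_flatten(const reply_queue *q, char *out, size_t outlen) {
    int i;
    assert(outlen > 0);
    out[0] = '\0';
    for (i = 0; i < q->used; i++)
        strncat(out, q->chunk[i], outlen - strlen(out) - 1);
}
```

The caller reserves the slot before iterating, appends each element while counting, and sets the length afterwards, so the header still ends up first in the flattened output.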
diff --git a/src/t_list.c b/src/t_list.c
index 2a981033..41d651f6 100644
--- a/src/t_list.c
+++ b/src/t_list.c
@@ -342,7 +342,7 @@ void pushxGenericCommand(redisClient *c, robj *refval, robj *val, int where) {
server.dirty++;
}
- addReplyUlong(c,listTypeLength(subject));
+ addReplyLongLong(c,listTypeLength(subject));
}
void lpushxCommand(redisClient *c) {
@@ -366,7 +366,7 @@ void linsertCommand(redisClient *c) {
void llenCommand(redisClient *c) {
robj *o = lookupKeyReadOrReply(c,c->argv[1],shared.czero);
if (o == NULL || checkType(c,o,REDIS_LIST)) return;
- addReplyUlong(c,listTypeLength(o));
+ addReplyLongLong(c,listTypeLength(o));
}
void lindexCommand(redisClient *c) {
@@ -494,7 +494,7 @@ void lrangeCommand(redisClient *c) {
rangelen = (end-start)+1;
/* Return the result in form of a multi-bulk reply */
- addReplySds(c,sdscatprintf(sdsempty(),"*%d\r\n",rangelen));
+ addReplyMultiBulkLen(c,rangelen);
listTypeIterator *li = listTypeInitIterator(o,start,REDIS_TAIL);
for (j = 0; j < rangelen; j++) {
redisAssert(listTypeNext(li,&entry));
@@ -594,7 +594,7 @@ void lremCommand(redisClient *c) {
decrRefCount(obj);
if (listTypeLength(subject) == 0) dbDelete(c->db,c->argv[1]);
- addReplySds(c,sdscatprintf(sdsempty(),":%d\r\n",removed));
+ addReplyLongLong(c,removed);
if (removed) touchWatchedKey(c->db,c->argv[1]);
}
@@ -772,7 +772,7 @@ int handleClientsWaitingListPush(redisClient *c, robj *key, robj *ele) {
redisAssert(ln != NULL);
receiver = ln->value;
- addReplySds(receiver,sdsnew("*2\r\n"));
+ addReplyMultiBulkLen(receiver,2);
addReplyBulk(receiver,key);
addReplyBulk(receiver,ele);
unblockClientWaitingData(receiver);
@@ -782,9 +782,20 @@ int handleClientsWaitingListPush(redisClient *c, robj *key, robj *ele) {
/* Blocking RPOP/LPOP */
void blockingPopGenericCommand(redisClient *c, int where) {
robj *o;
+ long long lltimeout;
time_t timeout;
int j;
+ /* Make sure timeout is an integer value */
+ if (getLongLongFromObjectOrReply(c,c->argv[c->argc-1],&lltimeout,
+ "timeout is not an integer") != REDIS_OK) return;
+
+ /* Make sure the timeout is not negative */
+ if (lltimeout < 0) {
+ addReplyError(c,"timeout is negative");
+ return;
+ }
+
for (j = 1; j < c->argc-1; j++) {
o = lookupKeyWrite(c->db,c->argv[j]);
if (o != NULL) {
@@ -811,7 +822,7 @@ void blockingPopGenericCommand(redisClient *c, int where) {
* "real" command will add the last element (the value)
* for us. If this sounds like a hack to you it's just
* because it is... */
- addReplySds(c,sdsnew("*2\r\n"));
+ addReplyMultiBulkLen(c,2);
addReplyBulk(c,argv[1]);
popGenericCommand(c,where);
@@ -823,8 +834,16 @@ void blockingPopGenericCommand(redisClient *c, int where) {
}
}
}
+
+ /* If we are inside a MULTI/EXEC and the list is empty the only thing
+ * we can do is treating it as a timeout (even with timeout 0). */
+ if (c->flags & REDIS_MULTI) {
+ addReply(c,shared.nullmultibulk);
+ return;
+ }
+
/* If the list is empty or the key does not exist we must block */
- timeout = strtol(c->argv[c->argc-1]->ptr,NULL,10);
+ timeout = lltimeout;
if (timeout > 0) timeout += time(NULL);
blockForKeys(c,c->argv+1,c->argc-2,timeout);
}
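The blockingPopGenericCommand hunks above validate the timeout before touching any key, where the removed line called strtol(...,NULL,10), which quietly yields 0 for non-numeric input. A minimal sketch of the same strict check in plain C follows. parse_timeout and its return codes are hypothetical and are not the getLongLongFromObjectOrReply implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, not part of Redis: parse a decimal timeout strictly.
 * Returns 0 and stores the value on success, -1 if the text is not an
 * integer, -2 if the integer is negative. */
int parse_timeout(const char *s, long long *out) {
    char *end;
    long long v;

    assert(s != NULL && out != NULL);
    errno = 0;
    v = strtoll(s, &end, 10);
    /* Reject empty input, trailing garbage, and out-of-range values. */
    if (end == s || *end != '\0' || errno == ERANGE) return -1;
    if (v < 0) return -2;
    *out = v;
    return 0;
}
```

The two failure codes correspond to the two error replies in the diff ("timeout is not an integer" and "timeout is negative"); on failure the output value is left untouched.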
diff --git a/src/t_set.c b/src/t_set.c
index 94b97633..e2ac5ae5 100644
--- a/src/t_set.c
+++ b/src/t_set.c
@@ -4,12 +4,182 @@
* Set Commands
*----------------------------------------------------------------------------*/
+/* Factory method to return a set that *can* hold "value". When the object has
+ * an integer-encodable value, an intset will be returned. Otherwise a regular
+ * hash table. */
+robj *setTypeCreate(robj *value) {
+ if (isObjectRepresentableAsLongLong(value,NULL) == REDIS_OK)
+ return createIntsetObject();
+ return createSetObject();
+}
+
+int setTypeAdd(robj *subject, robj *value) {
+ long long llval;
+ if (subject->encoding == REDIS_ENCODING_HT) {
+ if (dictAdd(subject->ptr,value,NULL) == DICT_OK) {
+ incrRefCount(value);
+ return 1;
+ }
+ } else if (subject->encoding == REDIS_ENCODING_INTSET) {
+ if (isObjectRepresentableAsLongLong(value,&llval) == REDIS_OK) {
+ uint8_t success = 0;
+ subject->ptr = intsetAdd(subject->ptr,llval,&success);
+ if (success) {
+ /* Convert to regular set when the intset contains
+ * too many entries. */
+ if (intsetLen(subject->ptr) > server.set_max_intset_entries)
+ setTypeConvert(subject,REDIS_ENCODING_HT);
+ return 1;
+ }
+ } else {
+ /* Failed to get integer from object, convert to regular set. */
+ setTypeConvert(subject,REDIS_ENCODING_HT);
+
+ /* The set *was* an intset and this value is not integer
+ * encodable, so dictAdd should always work. */
+ redisAssert(dictAdd(subject->ptr,value,NULL) == DICT_OK);
+ incrRefCount(value);
+ return 1;
+ }
+ } else {
+ redisPanic("Unknown set encoding");
+ }
+ return 0;
+}
+
+int setTypeRemove(robj *subject, robj *value) {
+ long long llval;
+ if (subject->encoding == REDIS_ENCODING_HT) {
+ if (dictDelete(subject->ptr,value) == DICT_OK) {
+ if (htNeedsResize(subject->ptr)) dictResize(subject->ptr);
+ return 1;
+ }
+ } else if (subject->encoding == REDIS_ENCODING_INTSET) {
+ if (isObjectRepresentableAsLongLong(value,&llval) == REDIS_OK) {
+ uint8_t success;
+ subject->ptr = intsetRemove(subject->ptr,llval,&success);
+ if (success) return 1;
+ }
+ } else {
+ redisPanic("Unknown set encoding");
+ }
+ return 0;
+}
+
+int setTypeIsMember(robj *subject, robj *value) {
+ long long llval;
+ if (subject->encoding == REDIS_ENCODING_HT) {
+ return dictFind((dict*)subject->ptr,value) != NULL;
+ } else if (subject->encoding == REDIS_ENCODING_INTSET) {
+ if (isObjectRepresentableAsLongLong(value,&llval) == REDIS_OK) {
+ return intsetFind((intset*)subject->ptr,llval);
+ }
+ } else {
+ redisPanic("Unknown set encoding");
+ }
+ return 0;
+}
+
+setTypeIterator *setTypeInitIterator(robj *subject) {
+ setTypeIterator *si = zmalloc(sizeof(setTypeIterator));
+ si->subject = subject;
+ si->encoding = subject->encoding;
+ if (si->encoding == REDIS_ENCODING_HT) {
+ si->di = dictGetIterator(subject->ptr);
+ } else if (si->encoding == REDIS_ENCODING_INTSET) {
+ si->ii = 0;
+ } else {
+ redisPanic("Unknown set encoding");
+ }
+ return si;
+}
+
+void setTypeReleaseIterator(setTypeIterator *si) {
+ if (si->encoding == REDIS_ENCODING_HT)
+ dictReleaseIterator(si->di);
+ zfree(si);
+}
+
+/* Move to the next entry in the set. Returns the object at the current
+ * position, or NULL when the end is reached. This object will have its
+ * refcount incremented, so the caller needs to take care of this. */
+robj *setTypeNext(setTypeIterator *si) {
+ robj *ret = NULL;
+ if (si->encoding == REDIS_ENCODING_HT) {
+ dictEntry *de = dictNext(si->di);
+ if (de != NULL) {
+ ret = dictGetEntryKey(de);
+ incrRefCount(ret);
+ }
+ } else if (si->encoding == REDIS_ENCODING_INTSET) {
+ int64_t llval;
+ if (intsetGet(si->subject->ptr,si->ii++,&llval))
+ ret = createStringObjectFromLongLong(llval);
+ }
+ return ret;
+}
+
+
+/* Return random element from set. The returned object will always have
+ * an incremented refcount. */
+robj *setTypeRandomElement(robj *subject) {
+ robj *ret = NULL;
+ if (subject->encoding == REDIS_ENCODING_HT) {
+ dictEntry *de = dictGetRandomKey(subject->ptr);
+ ret = dictGetEntryKey(de);
+ incrRefCount(ret);
+ } else if (subject->encoding == REDIS_ENCODING_INTSET) {
+ long long llval = intsetRandom(subject->ptr);
+ ret = createStringObjectFromLongLong(llval);
+ } else {
+ redisPanic("Unknown set encoding");
+ }
+ return ret;
+}
+
+unsigned long setTypeSize(robj *subject) {
+ if (subject->encoding == REDIS_ENCODING_HT) {
+ return dictSize((dict*)subject->ptr);
+ } else if (subject->encoding == REDIS_ENCODING_INTSET) {
+ return intsetLen((intset*)subject->ptr);
+ } else {
+ redisPanic("Unknown set encoding");
+ }
+}
+
+/* Convert the set to specified encoding. The resulting dict (when converting
+ * to a hashtable) is presized to hold the number of elements in the original
+ * set. */
+void setTypeConvert(robj *subject, int enc) {
+ setTypeIterator *si;
+ robj *element;
+ redisAssert(subject->type == REDIS_SET);
+
+ if (enc == REDIS_ENCODING_HT) {
+ dict *d = dictCreate(&setDictType,NULL);
+ /* Presize the dict to avoid rehashing */
+ dictExpand(d,intsetLen(subject->ptr));
+
+ /* setTypeNext returns a robj with incremented refcount */
+ si = setTypeInitIterator(subject);
+ while ((element = setTypeNext(si)) != NULL)
+ redisAssert(dictAdd(d,element,NULL) == DICT_OK);
+ setTypeReleaseIterator(si);
+
+ subject->encoding = REDIS_ENCODING_HT;
+ zfree(subject->ptr);
+ subject->ptr = d;
+ } else {
+ redisPanic("Unsupported set conversion");
+ }
+}
+
void saddCommand(redisClient *c) {
robj *set;
set = lookupKeyWrite(c->db,c->argv[1]);
if (set == NULL) {
- set = createSetObject();
+ set = setTypeCreate(c->argv[2]);
dbAdd(c->db,c->argv[1],set);
} else {
if (set->type != REDIS_SET) {
@@ -17,8 +187,7 @@ void saddCommand(redisClient *c) {
return;
}
}
- if (dictAdd(set->ptr,c->argv[2],NULL) == DICT_OK) {
- incrRefCount(c->argv[2]);
+ if (setTypeAdd(set,c->argv[2])) {
touchWatchedKey(c->db,c->argv[1]);
server.dirty++;
addReply(c,shared.cone);
@@ -33,11 +202,10 @@ void sremCommand(redisClient *c) {
if ((set = lookupKeyWriteOrReply(c,c->argv[1],shared.czero)) == NULL ||
checkType(c,set,REDIS_SET)) return;
- if (dictDelete(set->ptr,c->argv[2]) == DICT_OK) {
- server.dirty++;
+ if (setTypeRemove(set,c->argv[2])) {
+ if (setTypeSize(set) == 0) dbDelete(c->db,c->argv[1]);
touchWatchedKey(c->db,c->argv[1]);
- if (htNeedsResize(set->ptr)) dictResize(set->ptr);
- if (dictSize((dict*)set->ptr) == 0) dbDelete(c->db,c->argv[1]);
+ server.dirty++;
addReply(c,shared.cone);
} else {
addReply(c,shared.czero);
@@ -45,40 +213,48 @@ void sremCommand(redisClient *c) {
}
void smoveCommand(redisClient *c) {
- robj *srcset, *dstset;
-
+ robj *srcset, *dstset, *ele;
srcset = lookupKeyWrite(c->db,c->argv[1]);
dstset = lookupKeyWrite(c->db,c->argv[2]);
+ ele = c->argv[3];
- /* If the source key does not exist return 0, if it's of the wrong type
- * raise an error */
- if (srcset == NULL || srcset->type != REDIS_SET) {
- addReply(c, srcset ? shared.wrongtypeerr : shared.czero);
- return;
- }
- /* Error if the destination key is not a set as well */
- if (dstset && dstset->type != REDIS_SET) {
- addReply(c,shared.wrongtypeerr);
- return;
- }
- /* Remove the element from the source set */
- if (dictDelete(srcset->ptr,c->argv[3]) == DICT_ERR) {
- /* Key not found in the src set! return zero */
+ /* If the source key does not exist return 0 */
+ if (srcset == NULL) {
addReply(c,shared.czero);
return;
}
- if (dictSize((dict*)srcset->ptr) == 0 && srcset != dstset)
- dbDelete(c->db,c->argv[1]);
+
+ /* If the source key has the wrong type, or the destination key
+ * is set and has the wrong type, return with an error. */
+ if (checkType(c,srcset,REDIS_SET) ||
+ (dstset && checkType(c,dstset,REDIS_SET))) return;
+
+ /* If srcset and dstset are equal, SMOVE is a no-op */
+ if (srcset == dstset) {
+ addReply(c,shared.cone);
+ return;
+ }
+
+ /* If the element cannot be removed from the src set, return 0. */
+ if (!setTypeRemove(srcset,ele)) {
+ addReply(c,shared.czero);
+ return;
+ }
+
+ /* Remove the src set from the database when empty */
+ if (setTypeSize(srcset) == 0) dbDelete(c->db,c->argv[1]);
touchWatchedKey(c->db,c->argv[1]);
touchWatchedKey(c->db,c->argv[2]);
server.dirty++;
- /* Add the element to the destination set */
+
+ /* Create the destination set when it doesn't exist */
if (!dstset) {
- dstset = createSetObject();
+ dstset = setTypeCreate(ele);
dbAdd(c->db,c->argv[2],dstset);
}
- if (dictAdd(dstset->ptr,c->argv[3],NULL) == DICT_OK)
- incrRefCount(c->argv[3]);
+
+ /* An extra key has changed when ele was successfully added to dstset */
+ if (setTypeAdd(dstset,ele)) server.dirty++;
addReply(c,shared.cone);
}
@@ -88,7 +264,7 @@ void sismemberCommand(redisClient *c) {
if ((set = lookupKeyReadOrReply(c,c->argv[1],shared.czero)) == NULL ||
checkType(c,set,REDIS_SET)) return;
- if (dictFind(set->ptr,c->argv[2]))
+ if (setTypeIsMember(set,c->argv[2]))
addReply(c,shared.cone);
else
addReply(c,shared.czero);
@@ -96,75 +272,64 @@ void sismemberCommand(redisClient *c) {
void scardCommand(redisClient *c) {
robj *o;
- dict *s;
if ((o = lookupKeyReadOrReply(c,c->argv[1],shared.czero)) == NULL ||
checkType(c,o,REDIS_SET)) return;
- s = o->ptr;
- addReplyUlong(c,dictSize(s));
+ addReplyLongLong(c,setTypeSize(o));
}
void spopCommand(redisClient *c) {
- robj *set;
- dictEntry *de;
+ robj *set, *ele;
if ((set = lookupKeyWriteOrReply(c,c->argv[1],shared.nullbulk)) == NULL ||
checkType(c,set,REDIS_SET)) return;
- de = dictGetRandomKey(set->ptr);
- if (de == NULL) {
+ ele = setTypeRandomElement(set);
+ if (ele == NULL) {
addReply(c,shared.nullbulk);
} else {
- robj *ele = dictGetEntryKey(de);
-
+ setTypeRemove(set,ele);
addReplyBulk(c,ele);
- dictDelete(set->ptr,ele);
- if (htNeedsResize(set->ptr)) dictResize(set->ptr);
- if (dictSize((dict*)set->ptr) == 0) dbDelete(c->db,c->argv[1]);
+ decrRefCount(ele);
+ if (setTypeSize(set) == 0) dbDelete(c->db,c->argv[1]);
touchWatchedKey(c->db,c->argv[1]);
server.dirty++;
}
}
void srandmemberCommand(redisClient *c) {
- robj *set;
- dictEntry *de;
+ robj *set, *ele;
if ((set = lookupKeyReadOrReply(c,c->argv[1],shared.nullbulk)) == NULL ||
checkType(c,set,REDIS_SET)) return;
- de = dictGetRandomKey(set->ptr);
- if (de == NULL) {
+ ele = setTypeRandomElement(set);
+ if (ele == NULL) {
addReply(c,shared.nullbulk);
} else {
- robj *ele = dictGetEntryKey(de);
-
addReplyBulk(c,ele);
+ decrRefCount(ele);
}
}
int qsortCompareSetsByCardinality(const void *s1, const void *s2) {
- dict **d1 = (void*) s1, **d2 = (void*) s2;
-
- return dictSize(*d1)-dictSize(*d2);
+ return setTypeSize(*(robj**)s1)-setTypeSize(*(robj**)s2);
}
-void sinterGenericCommand(redisClient *c, robj **setskeys, unsigned long setsnum, robj *dstkey) {
- dict **dv = zmalloc(sizeof(dict*)*setsnum);
- dictIterator *di;
- dictEntry *de;
- robj *lenobj = NULL, *dstset = NULL;
+void sinterGenericCommand(redisClient *c, robj **setkeys, unsigned long setnum, robj *dstkey) {
+ robj **sets = zmalloc(sizeof(robj*)*setnum);
+ setTypeIterator *si;
+ robj *ele, *dstset = NULL;
+ void *replylen = NULL;
unsigned long j, cardinality = 0;
- for (j = 0; j < setsnum; j++) {
- robj *setobj;
-
- setobj = dstkey ?
- lookupKeyWrite(c->db,setskeys[j]) :
- lookupKeyRead(c->db,setskeys[j]);
+ for (j = 0; j < setnum; j++) {
+ robj *setobj = dstkey ?
+ lookupKeyWrite(c->db,setkeys[j]) :
+ lookupKeyRead(c->db,setkeys[j]);
if (!setobj) {
- zfree(dv);
+ zfree(sets);
if (dstkey) {
if (dbDelete(c->db,dstkey)) {
touchWatchedKey(c->db,dstkey);
@@ -176,16 +341,15 @@ void sinterGenericCommand(redisClient *c, robj **setskeys, unsigned long setsnum
}
return;
}
- if (setobj->type != REDIS_SET) {
- zfree(dv);
- addReply(c,shared.wrongtypeerr);
+ if (checkType(c,setobj,REDIS_SET)) {
+ zfree(sets);
return;
}
- dv[j] = setobj->ptr;
+ sets[j] = setobj;
}
/* Sort sets from the smallest to largest, this will improve our
* algorithm's performance */
- qsort(dv,setsnum,sizeof(dict*),qsortCompareSetsByCardinality);
+ qsort(sets,setnum,sizeof(robj*),qsortCompareSetsByCardinality);
/* The first thing we should output is the total number of elements...
* since this is a multi-bulk write, but at this stage we don't know
@@ -193,45 +357,41 @@ void sinterGenericCommand(redisClient *c, robj **setskeys, unsigned long setsnum
* to the output list and save the pointer to later modify it with the
* right length */
if (!dstkey) {
- lenobj = createObject(REDIS_STRING,NULL);
- addReply(c,lenobj);
- decrRefCount(lenobj);
+ replylen = addDeferredMultiBulkLength(c);
} else {
/* If we have a target key where to store the resulting set
* create this key with an empty set inside */
- dstset = createSetObject();
+ dstset = createIntsetObject();
}
/* Iterate all the elements of the first (smallest) set, and test
* the element against all the other sets, if at least one set does
* not include the element it is discarded */
- di = dictGetIterator(dv[0]);
+ si = setTypeInitIterator(sets[0]);
+ while((ele = setTypeNext(si)) != NULL) {
+ for (j = 1; j < setnum; j++)
+ if (!setTypeIsMember(sets[j],ele)) break;
- while((de = dictNext(di)) != NULL) {
- robj *ele;
-
- for (j = 1; j < setsnum; j++)
- if (dictFind(dv[j],dictGetEntryKey(de)) == NULL) break;
- if (j != setsnum)
- continue; /* at least one set does not contain the member */
- ele = dictGetEntryKey(de);
- if (!dstkey) {
- addReplyBulk(c,ele);
- cardinality++;
- } else {
- dictAdd(dstset->ptr,ele,NULL);
- incrRefCount(ele);
+ /* Only take action when all sets contain the member */
+ if (j == setnum) {
+ if (!dstkey) {
+ addReplyBulk(c,ele);
+ cardinality++;
+ } else {
+ setTypeAdd(dstset,ele);
+ }
}
+ decrRefCount(ele);
}
- dictReleaseIterator(di);
+ setTypeReleaseIterator(si);
if (dstkey) {
/* Store the resulting set into the target, if the intersection
* is not an empty set. */
dbDelete(c->db,dstkey);
- if (dictSize((dict*)dstset->ptr) > 0) {
+ if (setTypeSize(dstset) > 0) {
dbAdd(c->db,dstkey,dstset);
- addReplyLongLong(c,dictSize((dict*)dstset->ptr));
+ addReplyLongLong(c,setTypeSize(dstset));
} else {
decrRefCount(dstset);
addReply(c,shared.czero);
@@ -239,9 +399,9 @@ void sinterGenericCommand(redisClient *c, robj **setskeys, unsigned long setsnum
touchWatchedKey(c->db,dstkey);
server.dirty++;
} else {
- lenobj->ptr = sdscatprintf(sdsempty(),"*%lu\r\n",cardinality);
+ setDeferredMultiBulkLength(c,replylen,cardinality);
}
- zfree(dv);
+ zfree(sets);
}
void sinterCommand(redisClient *c) {
@@ -252,85 +412,78 @@ void sinterstoreCommand(redisClient *c) {
sinterGenericCommand(c,c->argv+2,c->argc-2,c->argv[1]);
}
-void sunionDiffGenericCommand(redisClient *c, robj **setskeys, int setsnum, robj *dstkey, int op) {
- dict **dv = zmalloc(sizeof(dict*)*setsnum);
- dictIterator *di;
- dictEntry *de;
- robj *dstset = NULL;
+#define REDIS_OP_UNION 0
+#define REDIS_OP_DIFF 1
+#define REDIS_OP_INTER 2
+
+void sunionDiffGenericCommand(redisClient *c, robj **setkeys, int setnum, robj *dstkey, int op) {
+ robj **sets = zmalloc(sizeof(robj*)*setnum);
+ setTypeIterator *si;
+ robj *ele, *dstset = NULL;
int j, cardinality = 0;
- for (j = 0; j < setsnum; j++) {
- robj *setobj;
-
- setobj = dstkey ?
- lookupKeyWrite(c->db,setskeys[j]) :
- lookupKeyRead(c->db,setskeys[j]);
+ for (j = 0; j < setnum; j++) {
+ robj *setobj = dstkey ?
+ lookupKeyWrite(c->db,setkeys[j]) :
+ lookupKeyRead(c->db,setkeys[j]);
if (!setobj) {
- dv[j] = NULL;
+ sets[j] = NULL;
continue;
}
- if (setobj->type != REDIS_SET) {
- zfree(dv);
- addReply(c,shared.wrongtypeerr);
+ if (checkType(c,setobj,REDIS_SET)) {
+ zfree(sets);
return;
}
- dv[j] = setobj->ptr;
+ sets[j] = setobj;
}
/* We need a temp set object to store our union. If the dstkey
* is not NULL (that is, we are inside an SUNIONSTORE operation) then
* this set object will be the resulting object to set into the target key*/
- dstset = createSetObject();
+ dstset = createIntsetObject();
/* Iterate all the elements of all the sets, add every element a single
* time to the result set */
- for (j = 0; j < setsnum; j++) {
- if (op == REDIS_OP_DIFF && j == 0 && !dv[j]) break; /* result set is empty */
- if (!dv[j]) continue; /* non existing keys are like empty sets */
+ for (j = 0; j < setnum; j++) {
+ if (op == REDIS_OP_DIFF && j == 0 && !sets[j]) break; /* result set is empty */
+ if (!sets[j]) continue; /* non existing keys are like empty sets */
- di = dictGetIterator(dv[j]);
-
- while((de = dictNext(di)) != NULL) {
- robj *ele;
-
- /* dictAdd will not add the same element multiple times */
- ele = dictGetEntryKey(de);
+ si = setTypeInitIterator(sets[j]);
+ while((ele = setTypeNext(si)) != NULL) {
if (op == REDIS_OP_UNION || j == 0) {
- if (dictAdd(dstset->ptr,ele,NULL) == DICT_OK) {
- incrRefCount(ele);
+ if (setTypeAdd(dstset,ele)) {
cardinality++;
}
} else if (op == REDIS_OP_DIFF) {
- if (dictDelete(dstset->ptr,ele) == DICT_OK) {
+ if (setTypeRemove(dstset,ele)) {
cardinality--;
}
}
+ decrRefCount(ele);
}
- dictReleaseIterator(di);
+ setTypeReleaseIterator(si);
- /* result set is empty? Exit asap. */
+ /* Exit when result set is empty. */
if (op == REDIS_OP_DIFF && cardinality == 0) break;
}
/* Output the content of the resulting set, if not in STORE mode */
if (!dstkey) {
- addReplySds(c,sdscatprintf(sdsempty(),"*%d\r\n",cardinality));
- di = dictGetIterator(dstset->ptr);
- while((de = dictNext(di)) != NULL) {
- robj *ele;
-
- ele = dictGetEntryKey(de);
+ addReplyMultiBulkLen(c,cardinality);
+ si = setTypeInitIterator(dstset);
+ while((ele = setTypeNext(si)) != NULL) {
addReplyBulk(c,ele);
+ decrRefCount(ele);
}
- dictReleaseIterator(di);
+ setTypeReleaseIterator(si);
decrRefCount(dstset);
} else {
/* If we have a target key where to store the resulting set
* create this key with the result set inside */
dbDelete(c->db,dstkey);
- if (dictSize((dict*)dstset->ptr) > 0) {
+ if (setTypeSize(dstset) > 0) {
dbAdd(c->db,dstkey,dstset);
- addReplyLongLong(c,dictSize((dict*)dstset->ptr));
+ addReplyLongLong(c,setTypeSize(dstset));
} else {
decrRefCount(dstset);
addReply(c,shared.czero);
@@ -338,7 +491,7 @@ void sunionDiffGenericCommand(redisClient *c, robj **setskeys, int setsnum, robj
touchWatchedKey(c->db,dstkey);
server.dirty++;
}
- zfree(dv);
+ zfree(sets);
}
void sunionCommand(redisClient *c) {
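The t_set.c hunks above introduce a dual encoding. A set starts as an intset while every member is integer-encodable. It is converted to a hash table, one way, once a non-integer member arrives or the entry count exceeds server.set_max_intset_entries. A toy sketch of that upgrade rule follows, under loud assumptions: toy_set, its fixed-size arrays, linear scans, and the threshold of 4 are all illustrative, while the real intset is a sorted compact array and the real fallback is a dict.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_INT_ENTRIES 4  /* assumed threshold, stands in for set_max_intset_entries */
#define MAX_MEMBERS 64
#define MEMBER_LEN 24

typedef enum { ENC_INTS, ENC_STRINGS } set_encoding;

typedef struct {
    set_encoding enc;
    int len;
    long long ints[MAX_MEMBERS];         /* used while enc == ENC_INTS */
    char strs[MAX_MEMBERS][MEMBER_LEN];  /* used while enc == ENC_STRINGS */
} toy_set;

/* Return 1 and store the value if s is a canonical decimal integer. */
static int as_integer(const char *s, long long *out) {
    char *end, buf[MEMBER_LEN];
    long long v;

    errno = 0;
    v = strtoll(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE) return 0;
    /* Require a round trip so "007" or "+1" stay strings. */
    snprintf(buf, sizeof(buf), "%lld", v);
    if (strcmp(buf, s) != 0) return 0;
    *out = v;
    return 1;
}

/* One-way conversion: rewrite every integer as its string form. */
static void convert_to_strings(toy_set *s) {
    int i;
    for (i = 0; i < s->len; i++)
        snprintf(s->strs[i], MEMBER_LEN, "%lld", s->ints[i]);
    s->enc = ENC_STRINGS;
}

/* Return 1 if member is present, 0 otherwise. */
int toy_set_contains(const toy_set *s, const char *member) {
    long long v;
    int i;

    if (s->enc == ENC_INTS) {
        /* A non-integer can never be in an integer-only set. */
        if (!as_integer(member, &v)) return 0;
        for (i = 0; i < s->len; i++)
            if (s->ints[i] == v) return 1;
        return 0;
    }
    for (i = 0; i < s->len; i++)
        if (strcmp(s->strs[i], member) == 0) return 1;
    return 0;
}

/* Return 1 if member was added, 0 if it was already present. */
int toy_set_add(toy_set *s, const char *member) {
    long long v;

    assert(strlen(member) < MEMBER_LEN && s->len < MAX_MEMBERS);
    if (toy_set_contains(s, member)) return 0;
    if (s->enc == ENC_INTS && as_integer(member, &v)) {
        s->ints[s->len++] = v;
        /* Too many entries: give up the compact encoding. */
        if (s->len > MAX_INT_ENTRIES) convert_to_strings(s);
        return 1;
    }
    /* Non-integer member: upgrade first, then store as a string. */
    if (s->enc == ENC_INTS) convert_to_strings(s);
    strcpy(s->strs[s->len++], member);
    return 1;
}
```

As in setTypeAdd, callers only see the add/contains interface; the encoding switch happens inside the add path and is never reversed.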
diff --git a/src/t_string.c b/src/t_string.c
index f55595c2..509c630a 100644
--- a/src/t_string.c
+++ b/src/t_string.c
@@ -12,12 +12,11 @@ void setGenericCommand(redisClient *c, int nx, robj *key, robj *val, robj *expir
if (getLongFromObjectOrReply(c, expire, &seconds, NULL) != REDIS_OK)
return;
if (seconds <= 0) {
- addReplySds(c,sdsnew("-ERR invalid expire time in SETEX\r\n"));
+ addReplyError(c,"invalid expire time in SETEX");
return;
}
}
- if (nx) deleteIfVolatile(c->db,key);
retval = dbAdd(c->db,key,val);
if (retval == REDIS_ERR) {
if (!nx) {
@@ -80,7 +79,7 @@ void getsetCommand(redisClient *c) {
void mgetCommand(redisClient *c) {
int j;
- addReplySds(c,sdscatprintf(sdsempty(),"*%d\r\n",c->argc-1));
+ addReplyMultiBulkLen(c,c->argc-1);
for (j = 1; j < c->argc; j++) {
robj *o = lookupKeyRead(c->db,c->argv[j]);
if (o == NULL) {
@@ -99,7 +98,7 @@ void msetGenericCommand(redisClient *c, int nx) {
int j, busykeys = 0;
if ((c->argc % 2) == 0) {
- addReplySds(c,sdsnew("-ERR wrong number of arguments for MSET\r\n"));
+ addReplyError(c,"wrong number of arguments for MSET");
return;
}
/* Handle the NX flag. The MSETNX semantic is to return zero and don't
@@ -212,7 +211,7 @@ void appendCommand(redisClient *c) {
}
touchWatchedKey(c->db,c->argv[1]);
server.dirty++;
- addReplySds(c,sdscatprintf(sdsempty(),":%lu\r\n",(unsigned long)totlen));
+ addReplyLongLong(c,totlen);
}
void substrCommand(redisClient *c) {
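Across these files the diff replaces hand-built protocol strings ("*%d\r\n", ":%d\r\n", "-ERR ...\r\n") with addReplyMultiBulkLen, addReplyLongLong and addReplyError. The wire formats those helpers must produce can be read off the removed lines. A minimal sketch of equivalent formatters follows. The format_* names and the buffer-based signatures are hypothetical, and the real helpers append to the client's reply list instead.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical formatters; each writes one reply header into buf and
 * returns the number of characters written, as snprintf does. */

/* Multi-bulk header: "*<count>\r\n". */
int format_multibulk_len(char *buf, size_t n, long count) {
    assert(buf != NULL);
    return snprintf(buf, n, "*%ld\r\n", count);
}

/* Integer reply: ":<value>\r\n". */
int format_integer(char *buf, size_t n, long long value) {
    assert(buf != NULL);
    return snprintf(buf, n, ":%lld\r\n", value);
}

/* Error reply: the caller passes only the message; the "-ERR " prefix
 * and the trailing CRLF are added here. */
int format_error(char *buf, size_t n, const char *msg) {
    assert(buf != NULL && msg != NULL);
    return snprintf(buf, n, "-ERR %s\r\n", msg);
}
```

Centralising the prefix and CRLF in one place is what lets call sites shrink to the bare message, as in addReplyError(c,"wrong number of arguments for MSET").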
diff --git a/src/t_zset.c b/src/t_zset.c
index e93e5c40..114c95d6 100644
--- a/src/t_zset.c
+++ b/src/t_zset.c
@@ -24,13 +24,7 @@
* from tail to head, useful for ZREVRANGE. */
zskiplistNode *zslCreateNode(int level, double score, robj *obj) {
- zskiplistNode *zn = zmalloc(sizeof(*zn));
-
- zn->forward = zmalloc(sizeof(zskiplistNode*) * level);
- if (level > 1)
- zn->span = zmalloc(sizeof(unsigned int) * (level - 1));
- else
- zn->span = NULL;
+ zskiplistNode *zn = zmalloc(sizeof(*zn)+level*sizeof(struct zskiplistLevel));
zn->score = score;
zn->obj = obj;
return zn;
@@ -45,11 +39,8 @@ zskiplist *zslCreate(void) {
zsl->length = 0;
zsl->header = zslCreateNode(ZSKIPLIST_MAXLEVEL,0,NULL);
for (j = 0; j < ZSKIPLIST_MAXLEVEL; j++) {
- zsl->header->forward[j] = NULL;
-
- /* span has space for ZSKIPLIST_MAXLEVEL-1 elements */
- if (j < ZSKIPLIST_MAXLEVEL-1)
- zsl->header->span[j] = 0;
+ zsl->header->level[j].forward = NULL;
+ zsl->header->level[j].span = 0;
}
zsl->header->backward = NULL;
zsl->tail = NULL;
@@ -58,19 +49,15 @@ zskiplist *zslCreate(void) {
void zslFreeNode(zskiplistNode *node) {
decrRefCount(node->obj);
- zfree(node->forward);
- zfree(node->span);
zfree(node);
}
void zslFree(zskiplist *zsl) {
- zskiplistNode *node = zsl->header->forward[0], *next;
+ zskiplistNode *node = zsl->header->level[0].forward, *next;
- zfree(zsl->header->forward);
- zfree(zsl->header->span);
zfree(zsl->header);
while(node) {
- next = node->forward[0];
+ next = node->level[0].forward;
zslFreeNode(node);
node = next;
}
@@ -84,7 +71,7 @@ int zslRandomLevel(void) {
return (level<ZSKIPLIST_MAXLEVEL) ? level : ZSKIPLIST_MAXLEVEL;
...
for (i = zsl->level-1; i >= 0; i--) {
/* store rank that is crossed to reach the insert position */
rank[i] = i == (zsl->level-1) ? 0 : rank[i+1];
-
- while (x->forward[i] &&
- (x->forward[i]->score < score ||
- (x->forward[i]->score == score &&
- compareStringObjects(x->forward[i]->obj,obj) < 0))) {
- rank[i] += i > 0 ? x->span[i-1] : 1;
- x = x->forward[i];
+ while (x->level[i].forward &&
+ (x->level[i].forward->score < score ||
+ (x->level[i].forward->score == score &&
+ compareStringObjects(x->level[i].forward->obj,obj) < 0))) {
+ rank[i] += x->level[i].span;
+ x = x->level[i].forward;
}
update[i] = x;
}
@@ -112,56 +98,51 @@ void zslInsert(zskiplist *zsl, double score, robj *obj) {
for (i = zsl->level; i < level; i++) {
rank[i] = 0;
update[i] = zsl->header;
- update[i]->span[i-1] = zsl->length;
+ update[i]->level[i].span = zsl->length;
}
zsl->level = level;
}
x = zslCreateNode(level,score,obj);
for (i = 0; i < level; i++) {
- x->forward[i] = update[i]->forward[i];
- update[i]->forward[i] = x;
+ x->level[i].forward = update[i]->level[i].forward;
+ update[i]->level[i].forward = x;
/* update span covered by update[i] as x is inserted here */
- if (i > 0) {
- x->span[i-1] = update[i]->span[i-1] - (rank[0] - rank[i]);
- update[i]->span[i-1] = (rank[0] - rank[i]) + 1;
- }
+ x->level[i].span = update[i]->level[i].span - (rank[0] - rank[i]);
+ update[i]->level[i].span = (rank[0] - rank[i]) + 1;
}
/* increment span for untouched levels */
for (i = level; i < zsl->level; i++) {
- update[i]->span[i-1]++;
+ update[i]->level[i].span++;
}
x->backward = (update[0] == zsl->header) ? NULL : update[0];
- if (x->forward[0])
- x->forward[0]->backward = x;
+ if (x->level[0].forward)
+ x->level[0].forward->backward = x;
else
zsl->tail = x;
zsl->length++;
+ return x;
}
/* Internal function used by zslDelete, zslDeleteByScore and zslDeleteByRank */
void zslDeleteNode(zskiplist *zsl, zskiplistNode *x, zskiplistNode **update) {
int i;
for (i = 0; i < zsl->level; i++) {
- if (update[i]->forward[i] == x) {
- if (i > 0) {
- update[i]->span[i-1] += x->span[i-1] - 1;
- }
- update[i]->forward[i] = x->forward[i];
+ if (update[i]->level[i].forward == x) {
+ update[i]->level[i].span += x->level[i].span - 1;
+ update[i]->level[i].forward = x->level[i].forward;
} else {
- /* invariant: i > 0, because update[0]->forward[0]
- * is always equal to x */
- update[i]->span[i-1] -= 1;
+ update[i]->level[i].span -= 1;
}
}
- if (x->forward[0]) {
- x->forward[0]->backward = x->backward;
+ if (x->level[0].forward) {
+ x->level[0].forward->backward = x->backward;
} else {
zsl->tail = x->backward;
}
- while(zsl->level > 1 && zsl->header->forward[zsl->level-1] == NULL)
+ while(zsl->level > 1 && zsl->header->level[zsl->level-1].forward == NULL)
zsl->level--;
zsl->length--;
}
@@ -173,16 +154,16 @@ int zslDelete(zskiplist *zsl, double score, robj *obj) {
x = zsl->header;
for (i = zsl->level-1; i >= 0; i--) {
- while (x->forward[i] &&
- (x->forward[i]->score < score ||
- (x->forward[i]->score == score &&
- compareStringObjects(x->forward[i]->obj,obj) < 0)))
- x = x->forward[i];
+ while (x->level[i].forward &&
+ (x->level[i].forward->score < score ||
+ (x->level[i].forward->score == score &&
+ compareStringObjects(x->level[i].forward->obj,obj) < 0)))
+ x = x->level[i].forward;
update[i] = x;
}
/* We may have multiple elements with the same score, what we need
* is to find the element with both the right score and object. */
- x = x->forward[0];
+ x = x->level[0].forward;
if (x && score == x->score && equalStringObjects(x->obj,obj)) {
zslDeleteNode(zsl, x, update);
zslFreeNode(x);
@@ -204,16 +185,16 @@ unsigned long zslDeleteRangeByScore(zskiplist *zsl, double min, double max, dict
x = zsl->header;
for (i = zsl->level-1; i >= 0; i--) {
- while (x->forward[i] && x->forward[i]->score < min)
- x = x->forward[i];
+ while (x->level[i].forward && x->level[i].forward->score < min)
+ x = x->level[i].forward;
update[i] = x;
}
/* We may have multiple elements with the same score, what we need
* is to find the element with both the right score and object. */
- x = x->forward[0];
+ x = x->level[0].forward;
while (x && x->score <= max) {
- zskiplistNode *next = x->forward[0];
- zslDeleteNode(zsl, x, update);
+ zskiplistNode *next = x->level[0].forward;
+ zslDeleteNode(zsl,x,update);
dictDelete(dict,x->obj);
zslFreeNode(x);
removed++;
@@ -231,18 +212,18 @@ unsigned long zslDeleteRangeByRank(zskiplist *zsl, unsigned int start, unsigned
x = zsl->header;
for (i = zsl->level-1; i >= 0; i--) {
- while (x->forward[i] && (traversed + (i > 0 ? x->span[i-1] : 1)) < start) {
- traversed += i > 0 ? x->span[i-1] : 1;
- x = x->forward[i];
+ while (x->level[i].forward && (traversed + x->level[i].span) < start) {
+ traversed += x->level[i].span;
+ x = x->level[i].forward;
}
update[i] = x;
}
traversed++;
- x = x->forward[0];
+ x = x->level[0].forward;
while (x && traversed <= end) {
- zskiplistNode *next = x->forward[0];
- zslDeleteNode(zsl, x, update);
+ zskiplistNode *next = x->level[0].forward;
+ zslDeleteNode(zsl,x,update);
dictDelete(dict,x->obj);
zslFreeNode(x);
removed++;
@@ -260,12 +241,12 @@ zskiplistNode *zslFirstWithScore(zskiplist *zsl, double score) {
x = zsl->header;
for (i = zsl->level-1; i >= 0; i--) {
- while (x->forward[i] && x->forward[i]->score < score)
- x = x->forward[i];
+ while (x->level[i].forward && x->level[i].forward->score < score)
+ x = x->level[i].forward;
}
/* We may have multiple elements with the same score, what we need
* is to find the element with both the right score and object. */
- return x->forward[0];
+ return x->level[0].forward;
}
/* Find the rank for an element by both score and key.
@@ -279,12 +260,12 @@ unsigned long zslistTypeGetRank(zskiplist *zsl, double score, robj *o) {
x = zsl->header;
for (i = zsl->level-1; i >= 0; i--) {
- while (x->forward[i] &&
- (x->forward[i]->score < score ||
- (x->forward[i]->score == score &&
- compareStringObjects(x->forward[i]->obj,o) <= 0))) {
- rank += i > 0 ? x->span[i-1] : 1;
- x = x->forward[i];
+ while (x->level[i].forward &&
+ (x->level[i].forward->score < score ||
+ (x->level[i].forward->score == score &&
+ compareStringObjects(x->level[i].forward->obj,o) <= 0))) {
+ rank += x->level[i].span;
+ x = x->level[i].forward;
}
/* x might be equal to zsl->header, so test if obj is non-NULL */
@@ -303,10 +284,10 @@ zskiplistNode* zslistTypeGetElementByRank(zskiplist *zsl, unsigned long rank) {
x = zsl->header;
for (i = zsl->level-1; i >= 0; i--) {
- while (x->forward[i] && (traversed + (i>0 ? x->span[i-1] : 1)) <= rank)
+ while (x->level[i].forward && (traversed + x->level[i].span) <= rank)
{
- traversed += i > 0 ? x->span[i-1] : 1;
- x = x->forward[i];
+ traversed += x->level[i].span;
+ x = x->level[i].forward;
}
if (traversed == rank) {
return x;
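
The hunks above replace the old `forward[i]` / `span[i-1]` pair with a per-level struct that carries its own `span`, so rank arithmetic no longer needs the `i > 0 ? ... : 1` special case. The following is a minimal sketch of that rank lookup, not the Redis code itself: the `sketch_*` names, the fixed two-level layout, and the hand-built list are assumptions made for illustration.

```c
#include <stddef.h>

#define SKETCH_MAXLEVEL 2

/* Hypothetical node type: each level stores its forward pointer together
 * with the number of level-0 steps that pointer skips over. */
typedef struct sketch_node {
    double score;
    struct {
        struct sketch_node *forward;
        unsigned int span;
    } level[SKETCH_MAXLEVEL];
} sketch_node;

/* Build header -> a(1.0) -> b(2.0) -> c(3.0) in n[0..3], plus a level-1
 * shortcut from the header straight to b (span 2). */
static void sketch_build(sketch_node n[4]) {
    static const double scores[4] = {0.0, 1.0, 2.0, 3.0};
    for (int i = 0; i < 4; i++) {
        n[i].score = scores[i];
        n[i].level[0].forward = (i < 3) ? &n[i + 1] : NULL;
        n[i].level[0].span = (i < 3) ? 1 : 0;
        n[i].level[1].forward = NULL;
        n[i].level[1].span = 0;
    }
    n[0].level[1].forward = &n[2];
    n[0].level[1].span = 2;
}

/* Return the node at the given 1-based rank, or NULL when out of range. */
static sketch_node *sketch_get_by_rank(sketch_node *header, int levels,
                                       unsigned long rank) {
    sketch_node *x = header;
    unsigned long traversed = 0;

    for (int i = levels - 1; i >= 0; i--) {
        /* Follow this level while the jump does not overshoot the rank. */
        while (x->level[i].forward &&
               traversed + x->level[i].span <= rank) {
            traversed += x->level[i].span;
            x = x->level[i].forward;
        }
        if (traversed == rank && x != header) return x;
    }
    return NULL;
}
```

With the list built by `sketch_build`, asking for rank 2 is answered by the level-1 shortcut alone, while rank 3 needs one extra step on level 0.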
@@ -319,13 +300,11 @@ zskiplistNode* zslistTypeGetElementByRank(zskiplist *zsl, unsigned long rank) {
* Sorted set commands
*----------------------------------------------------------------------------*/
-/* This generic command implements both ZADD and ZINCRBY.
- * scoreval is the score if the operation is a ZADD (doincrement == 0) or
- * the increment if the operation is a ZINCRBY (doincrement == 1). */
-void zaddGenericCommand(redisClient *c, robj *key, robj *ele, double scoreval, int doincrement) {
+/* This generic command implements both ZADD and ZINCRBY. */
+void zaddGenericCommand(redisClient *c, robj *key, robj *ele, double score, int incr) {
robj *zsetobj;
zset *zs;
- double *score;
+ zskiplistNode *znode;
zsetobj = lookupKeyWrite(c->db,key);
if (zsetobj == NULL) {
@@ -339,72 +318,72 @@ void zaddGenericCommand(redisClient *c, robj *key, robj *ele, double scoreval, i
}
zs = zsetobj->ptr;
- /* Ok now since we implement both ZADD and ZINCRBY here the code
- * needs to handle the two different conditions. It's all about setting
- * '*score', that is, the new score to set, to the right value. */
- score = zmalloc(sizeof(double));
- if (doincrement) {
- dictEntry *de;
-
+ /* Since both ZADD and ZINCRBY are implemented here, we need to increment
+ * the score first by the current score if ZINCRBY is called. */
+ if (incr) {
/* Read the old score. If the element was not present starts from 0 */
- de = dictFind(zs->dict,ele);
- if (de) {
- double *oldscore = dictGetEntryVal(de);
- *score = *oldscore + scoreval;
- } else {
- *score = scoreval;
- }
- if (isnan(*score)) {
- addReplySds(c,
- sdsnew("-ERR resulting score is not a number (NaN)\r\n"));
- zfree(score);
+ dictEntry *de = dictFind(zs->dict,ele);
+ if (de != NULL)
+ score += *(double*)dictGetEntryVal(de);
+
+ if (isnan(score)) {
+ addReplyError(c,"resulting score is not a number (NaN)");
/* Note that we don't need to check if the zset may be empty and
* should be removed here, as we can only obtain NaN as score if
* there was already an element in the sorted set. */
return;
}
- } else {
- *score = scoreval;
}
- /* What follows is a simple remove and re-insert operation that is common
- * to both ZADD and ZINCRBY... */
- if (dictAdd(zs->dict,ele,score) == DICT_OK) {
- /* case 1: New element */
+ /* We need to remove and re-insert the element when it was already present
+ * in the dictionary, to update the skiplist. Note that we delay adding a
+ * pointer to the score because we want to reference the score in the
+ * skiplist node. */
+ if (dictAdd(zs->dict,ele,NULL) == DICT_OK) {
+ dictEntry *de;
+
+ /* New element */
incrRefCount(ele); /* added to hash */
- zslInsert(zs->zsl,*score,ele);
+ znode = zslInsert(zs->zsl,score,ele);
incrRefCount(ele); /* added to skiplist */
+
+ /* Update the score in the dict entry */
+ de = dictFind(zs->dict,ele);
+ redisAssert(de != NULL);
+ dictGetEntryVal(de) = &znode->score;
touchWatchedKey(c->db,c->argv[1]);
server.dirty++;
- if (doincrement)
- addReplyDouble(c,*score);
+ if (incr)
+ addReplyDouble(c,score);
else
addReply(c,shared.cone);
} else {
dictEntry *de;
- double *oldscore;
+ robj *curobj;
+ double *curscore;
+ int deleted;
- /* case 2: Score update operation */
+ /* Update score */
de = dictFind(zs->dict,ele);
redisAssert(de != NULL);
- oldscore = dictGetEntryVal(de);
- if (*score != *oldscore) {
- int deleted;
+ curobj = dictGetEntryKey(de);
+ curscore = dictGetEntryVal(de);
- /* Remove and insert the element in the skip list with new score */
- deleted = zslDelete(zs->zsl,*oldscore,ele);
+ /* When the score is updated, reuse the existing string object to
+ * prevent extra alloc/dealloc of strings on ZINCRBY. */
+ if (score != *curscore) {
+ deleted = zslDelete(zs->zsl,*curscore,curobj);
redisAssert(deleted != 0);
- zslInsert(zs->zsl,*score,ele);
- incrRefCount(ele);
- /* Update the score in the hash table */
- dictReplace(zs->dict,ele,score);
+ znode = zslInsert(zs->zsl,score,curobj);
+ incrRefCount(curobj);
+
+ /* Update the score in the current dict entry */
+ dictGetEntryVal(de) = &znode->score;
touchWatchedKey(c->db,c->argv[1]);
server.dirty++;
- } else {
- zfree(score);
}
- if (doincrement)
- addReplyDouble(c,*score);
+ if (incr)
+ addReplyDouble(c,score);
else
addReply(c,shared.czero);
}
@@ -426,7 +405,7 @@ void zremCommand(redisClient *c) {
robj *zsetobj;
zset *zs;
dictEntry *de;
- double *oldscore;
+ double curscore;
int deleted;
if ((zsetobj = lookupKeyWriteOrReply(c,c->argv[1],shared.czero)) == NULL ||
@@ -439,8 +418,8 @@ void zremCommand(redisClient *c) {
return;
}
/* Delete from the skiplist */
- oldscore = dictGetEntryVal(de);
- deleted = zslDelete(zs->zsl,*oldscore,c->argv[2]);
+ curscore = *(double*)dictGetEntryVal(de);
+ deleted = zslDelete(zs->zsl,curscore,c->argv[2]);
redisAssert(deleted != 0);
/* Delete from the hash table */
@@ -554,6 +533,7 @@ void zunionInterGenericCommand(redisClient *c, robj *dstkey, int op) {
zsetopsrc *src;
robj *dstobj;
zset *dstzset;
+ zskiplistNode *znode;
dictIterator *di;
dictEntry *de;
int touched = 0;
@@ -561,7 +541,8 @@ void zunionInterGenericCommand(redisClient *c, robj *dstkey, int op) {
/* expect setnum input keys to be given */
setnum = atoi(c->argv[2]->ptr);
if (setnum < 1) {
- addReplySds(c,sdsnew("-ERR at least 1 input key is needed for ZUNIONSTORE/ZINTERSTORE\r\n"));
+ addReplyError(c,
+ "at least 1 input key is needed for ZUNIONSTORE/ZINTERSTORE");
return;
}
@@ -644,28 +625,26 @@ void zunionInterGenericCommand(redisClient *c, robj *dstkey, int op) {
* from small to large, all src[i > 0].dict are non-empty too */
di = dictGetIterator(src[0].dict);
while((de = dictNext(di)) != NULL) {
- double *score = zmalloc(sizeof(double)), value;
- *score = src[0].weight * zunionInterDictValue(de);
+ double score, value;
+ score = src[0].weight * zunionInterDictValue(de);
for (j = 1; j < setnum; j++) {
dictEntry *other = dictFind(src[j].dict,dictGetEntryKey(de));
if (other) {
value = src[j].weight * zunionInterDictValue(other);
- zunionInterAggregate(score, value, aggregate);
+ zunionInterAggregate(&score, value, aggregate);
} else {
break;
}
}
- /* skip entry when not present in every source dict */
- if (j != setnum) {
- zfree(score);
- } else {
+ /* accept entry only when present in every source dict */
+ if (j == setnum) {
robj *o = dictGetEntryKey(de);
- dictAdd(dstzset->dict,o,score);
- incrRefCount(o); /* added to dictionary */
- zslInsert(dstzset->zsl,*score,o);
+ znode = zslInsert(dstzset->zsl,score,o);
incrRefCount(o); /* added to skiplist */
+ dictAdd(dstzset->dict,o,&znode->score);
+ incrRefCount(o); /* added to dictionary */
}
}
dictReleaseIterator(di);
@@ -676,11 +655,12 @@ void zunionInterGenericCommand(redisClient *c, robj *dstkey, int op) {
di = dictGetIterator(src[i].dict);
while((de = dictNext(di)) != NULL) {
- /* skip key when already processed */
- if (dictFind(dstzset->dict,dictGetEntryKey(de)) != NULL) continue;
+ double score, value;
- double *score = zmalloc(sizeof(double)), value;
- *score = src[i].weight * zunionInterDictValue(de);
+ /* skip key when already processed */
+ if (dictFind(dstzset->dict,dictGetEntryKey(de)) != NULL)
+ continue;
+ score = src[i].weight * zunionInterDictValue(de);
/* because the zsets are sorted by size, it's only possible
* for sets at larger indices to hold this entry */
@@ -688,15 +668,15 @@ void zunionInterGenericCommand(redisClient *c, robj *dstkey, int op) {
dictEntry *other = dictFind(src[j].dict,dictGetEntryKey(de));
if (other) {
value = src[j].weight * zunionInterDictValue(other);
- zunionInterAggregate(score, value, aggregate);
+ zunionInterAggregate(&score, value, aggregate);
}
}
robj *o = dictGetEntryKey(de);
- dictAdd(dstzset->dict,o,score);
- incrRefCount(o); /* added to dictionary */
- zslInsert(dstzset->zsl,*score,o);
+ znode = zslInsert(dstzset->zsl,score,o);
incrRefCount(o); /* added to skiplist */
+ dictAdd(dstzset->dict,o,&znode->score);
+ incrRefCount(o); /* added to dictionary */
}
dictReleaseIterator(di);
}
@@ -778,18 +758,17 @@ void zrangeGenericCommand(redisClient *c, int reverse) {
ln = start == 0 ? zsl->tail : zslistTypeGetElementByRank(zsl, llen-start);
} else {
ln = start == 0 ?
- zsl->header->forward[0] : zslistTypeGetElementByRank(zsl, start+1);
+ zsl->header->level[0].forward : zslistTypeGetElementByRank(zsl, start+1);
}
/* Return the result in form of a multi-bulk reply */
- addReplySds(c,sdscatprintf(sdsempty(),"*%d\r\n",
- withscores ? (rangelen*2) : rangelen));
+ addReplyMultiBulkLen(c,withscores ? (rangelen*2) : rangelen);
for (j = 0; j < rangelen; j++) {
ele = ln->obj;
addReplyBulk(c,ele);
if (withscores)
addReplyDouble(c,ln->score);
- ln = reverse ? ln->backward : ln->forward[0];
+ ln = reverse ? ln->backward : ln->level[0].forward;
}
}
@@ -840,8 +819,7 @@ void genericZrangebyscoreCommand(redisClient *c, int justcount) {
if (c->argc != (4 + withscores) && c->argc != (7 + withscores))
badsyntax = 1;
if (badsyntax) {
- addReplySds(c,
- sdsnew("-ERR wrong number of arguments for ZRANGEBYSCORE\r\n"));
+ addReplyError(c,"wrong number of arguments for ZRANGEBYSCORE");
return;
}
@@ -866,13 +844,14 @@ void genericZrangebyscoreCommand(redisClient *c, int justcount) {
zset *zsetobj = o->ptr;
zskiplist *zsl = zsetobj->zsl;
zskiplistNode *ln;
- robj *ele, *lenobj = NULL;
+ robj *ele;
+ void *replylen = NULL;
unsigned long rangelen = 0;
/* Get the first node with the score >= min, or with
* score > min if 'minex' is true. */
ln = zslFirstWithScore(zsl,min);
- while (minex && ln && ln->score == min) ln = ln->forward[0];
+ while (minex && ln && ln->score == min) ln = ln->level[0].forward;
if (ln == NULL) {
/* No element matching the specified interval */
@@ -884,16 +863,13 @@ void genericZrangebyscoreCommand(redisClient *c, int justcount) {
* are in the list, so we push this object that will represent
* the multi-bulk length in the output buffer, and will "fix"
* it later */
- if (!justcount) {
- lenobj = createObject(REDIS_STRING,NULL);
- addReply(c,lenobj);
- decrRefCount(lenobj);
- }
+ if (!justcount)
+ replylen = addDeferredMultiBulkLength(c);
while(ln && (maxex ? (ln->score < max) : (ln->score <= max))) {
if (offset) {
offset--;
- ln = ln->forward[0];
+ ln = ln->level[0].forward;
continue;
}
if (limit == 0) break;
@@ -903,14 +879,14 @@ void genericZrangebyscoreCommand(redisClient *c, int justcount) {
if (withscores)
addReplyDouble(c,ln->score);
}
- ln = ln->forward[0];
+ ln = ln->level[0].forward;
rangelen++;
if (limit > 0) limit--;
}
if (justcount) {
addReplyLongLong(c,(long)rangelen);
} else {
- lenobj->ptr = sdscatprintf(sdsempty(),"*%lu\r\n",
+ setDeferredMultiBulkLength(c,replylen,
withscores ? (rangelen*2) : rangelen);
}
}
@@ -933,7 +909,7 @@ void zcardCommand(redisClient *c) {
checkType(c,o,REDIS_ZSET)) return;
zs = o->ptr;
- addReplyUlong(c,zs->zsl->length);
+ addReplyLongLong(c,zs->zsl->length);
}
void zscoreCommand(redisClient *c) {
diff --git a/src/testhelp.h b/src/testhelp.h
new file mode 100644
index 00000000..d699f2ae
--- /dev/null
+++ b/src/testhelp.h
@@ -0,0 +1,54 @@
+/* This is a really minimal testing framework for C.
+ *
+ * Example:
+ *
+ * test_cond("Check if 1 == 1", 1==1)
+ * test_cond("Check if 5 > 10", 5 > 10)
+ * test_report()
+ *
+ * Copyright (c) 2010, Salvatore Sanfilippo
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * * Redistributions of source code must retain the above copyright notice,
+ * this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ * * Neither the name of Redis nor the names of its contributors may be used
+ * to endorse or promote products derived from this software without
+ * specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __TESTHELP_H
+#define __TESTHELP_H
+
+int __failed_tests = 0;
+int __test_num = 0;
+#define test_cond(descr,_c) do { \
+ __test_num++; printf("%d - %s: ", __test_num, descr); \
+ if(_c) printf("PASSED\n"); else {printf("FAILED\n"); __failed_tests++;} \
+} while(0);
+#define test_report() do { \
+ printf("%d tests, %d passed, %d failed\n", __test_num, \
+ __test_num-__failed_tests, __failed_tests); \
+ if (__failed_tests) { \
+ printf("=== WARNING === We have failed tests here...\n"); \
+ } \
+} while(0);
+
+#endif
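
A short usage sketch for the two macros in the new header, assuming a standalone translation unit. The macros are re-declared under hypothetical `sketch_` names so the block compiles on its own; unlike the header, the macro here has no trailing semicolon, so each call site supplies one.

```c
#include <stdio.h>

/* Counters mirroring __failed_tests and __test_num from the header. */
static int sketch_failed_tests = 0;
static int sketch_test_num = 0;

/* Print one numbered PASSED/FAILED line and count failures. */
#define sketch_test_cond(descr, cond) do { \
    sketch_test_num++; \
    printf("%d - %s: ", sketch_test_num, descr); \
    if (cond) printf("PASSED\n"); \
    else { printf("FAILED\n"); sketch_failed_tests++; } \
} while (0)

/* Run the two checks from the header's example comment and return the
 * number of failed checks. */
static int sketch_run_tests(void) {
    sketch_failed_tests = 0;
    sketch_test_num = 0;

    sketch_test_cond("Check if 1 == 1", 1 == 1);
    sketch_test_cond("Check if 5 > 10", 5 > 10);

    printf("%d tests, %d passed, %d failed\n", sketch_test_num,
           sketch_test_num - sketch_failed_tests, sketch_failed_tests);
    return sketch_failed_tests;
}
```

The second check is expected to fail, so `sketch_run_tests()` reports one failure out of two tests.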
diff --git a/src/util.c b/src/util.c
index cc2794f6..e304ff83 100644
--- a/src/util.c
+++ b/src/util.c
@@ -200,24 +200,44 @@ int ll2string(char *s, size_t len, long long value) {
return l;
}
-/* Check if the nul-terminated string 's' can be represented by a long
+/* Check if the sds string 's' can be represented by a long long
* (that is, is a number that fits into long long without any other space or
- * character before or after the digits).
+ * character before or after the digits, so that converting this number
+ * back to a string will result in the same bytes as the original string).
*
- * If so, the function returns REDIS_OK and *longval is set to the value
+ * If so, the function returns REDIS_OK and *llongval is set to the value
* of the number. Otherwise REDIS_ERR is returned */
-int isStringRepresentableAsLong(sds s, long *longval) {
+int isStringRepresentableAsLongLong(sds s, long long *llongval) {
char buf[32], *endptr;
- long value;
+ long long value;
int slen;
- value = strtol(s, &endptr, 10);
+ value = strtoll(s, &endptr, 10);
if (endptr[0] != '\0') return REDIS_ERR;
slen = ll2string(buf,32,value);
/* If the number converted back into a string is not identical
* then it's not possible to encode the string as integer */
if (sdslen(s) != (unsigned)slen || memcmp(buf,s,slen)) return REDIS_ERR;
- if (longval) *longval = value;
+ if (llongval) *llongval = value;
return REDIS_OK;
}
+
+int isStringRepresentableAsLong(sds s, long *longval) {
+ long long ll;
+
+ if (isStringRepresentableAsLongLong(s,&ll) == REDIS_ERR) return REDIS_ERR;
+ if (ll < LONG_MIN || ll > LONG_MAX) return REDIS_ERR;
+ if (longval) *longval = (long)ll;
+ return REDIS_OK;
+}
+
+int isObjectRepresentableAsLongLong(robj *o, long long *llongval) {
+ redisAssert(o->type == REDIS_STRING);
+ if (o->encoding == REDIS_ENCODING_INT) {
+ if (llongval) *llongval = (long) o->ptr;
+ return REDIS_OK;
+ } else {
+ return isStringRepresentableAsLongLong(o->ptr,llongval);
+ }
+}
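
The round-trip check described in the comment above (parse, print back, compare bytes) can be sketched with the standard library alone. This is an illustration under assumptions: it takes a plain nul-terminated string instead of an `sds`, uses `snprintf` in place of `ll2string`, and the function name is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 and store the value in *out (when out is not NULL) if s is the
 * canonical decimal form of a long long, otherwise return 0. */
static int sketch_string_to_ll(const char *s, long long *out) {
    char buf[32], *endptr;
    long long value = strtoll(s, &endptr, 10);

    /* Trailing bytes after the digits mean this is not a plain number. */
    if (endptr[0] != '\0') return 0;

    /* Print the value back and require a byte-for-byte match. This rejects
     * leading spaces, a leading '+', leading zeros, "-0", the empty string
     * and inputs that strtoll clamped on overflow. */
    int n = snprintf(buf, sizeof(buf), "%lld", value);
    if (n < 0 || (size_t)n != strlen(s) || memcmp(buf, s, (size_t)n) != 0)
        return 0;

    if (out) *out = value;
    return 1;
}
```

Comparing the printed form against the input is what makes the integer encoding lossless: any string that passes can be regenerated exactly from the stored number.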
diff --git a/src/version.h b/src/version.h
index cac59721..80decef1 100644
--- a/src/version.h
+++ b/src/version.h
@@ -1 +1 @@
-#define REDIS_VERSION "2.1.2"
+#define REDIS_VERSION "2.1.4"
diff --git a/src/vm.c b/src/vm.c
index 0ccc5fe2..ee831fb9 100644
--- a/src/vm.c
+++ b/src/vm.c
@@ -110,6 +110,11 @@ void vmInit(void) {
/* LZF requires a lot of stack */
pthread_attr_init(&server.io_threads_attr);
pthread_attr_getstacksize(&server.io_threads_attr, &stacksize);
+
+ /* Solaris may report a stacksize of 0, let's set it to 1 otherwise
+ * multiplying it by 2 in the while loop later will not really help ;) */
+ if (!stacksize) stacksize = 1;
+
while (stacksize < REDIS_THREAD_STACK_SIZE) stacksize *= 2;
pthread_attr_setstacksize(&server.io_threads_attr, stacksize);
/* Listen for events in the threaded I/O pipe */
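
The Solaris guard in this hunk matters because doubling zero never terminates the loop. A small sketch of the same logic as a pure function follows; the function name is hypothetical and the minimum is passed in as a parameter rather than read from `REDIS_THREAD_STACK_SIZE`.

```c
#include <stddef.h>

/* Double stacksize until it reaches at least minimum. A reported size of 0
 * is bumped to 1 first, otherwise the loop below would never make progress. */
static size_t sketch_grow_stacksize(size_t stacksize, size_t minimum) {
    if (!stacksize) stacksize = 1;
    while (stacksize < minimum) stacksize *= 2;
    return stacksize;
}
```

Starting from 1, the result is always the smallest power of two that is not below the minimum.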
@@ -395,15 +400,20 @@ double computeObjectSwappability(robj *o) {
z = (o->type == REDIS_ZSET);
d = z ? ((zset*)o->ptr)->dict : o->ptr;
- asize = sizeof(dict)+(sizeof(struct dictEntry*)*dictSlots(d));
- if (z) asize += sizeof(zset)-sizeof(dict);
- if (dictSize(d)) {
- de = dictGetRandomKey(d);
- ele = dictGetEntryKey(de);
- elesize = (ele->encoding == REDIS_ENCODING_RAW) ?
- (sizeof(*o)+sdslen(ele->ptr)) : sizeof(*o);
- asize += (sizeof(struct dictEntry)+elesize)*dictSize(d);
- if (z) asize += sizeof(zskiplistNode)*dictSize(d);
+ if (!z && o->encoding == REDIS_ENCODING_INTSET) {
+ intset *is = o->ptr;
+ asize = sizeof(*is)+is->encoding*is->length;
+ } else {
+ asize = sizeof(dict)+(sizeof(struct dictEntry*)*dictSlots(d));
+ if (z) asize += sizeof(zset)-sizeof(dict);
+ if (dictSize(d)) {
+ de = dictGetRandomKey(d);
+ ele = dictGetEntryKey(de);
+ elesize = (ele->encoding == REDIS_ENCODING_RAW) ?
+ (sizeof(*o)+sdslen(ele->ptr)) : sizeof(*o);
+ asize += (sizeof(struct dictEntry)+elesize)*dictSize(d);
+ if (z) asize += sizeof(zskiplistNode)*dictSize(d);
+ }
}
break;
case REDIS_HASH:
@@ -543,7 +553,15 @@ void freeIOJob(iojob *j) {
/* Every time a thread finished a Job, it writes a byte into the write side
* of an unix pipe in order to "awake" the main thread, and this function
- * is called. */
+ * is called.
+ *
+ * Note that this is called both by the event loop, when an I/O thread
+ * sends a byte in the notification pipe, and is also directly called from
+ * waitEmptyIOJobsQueue().
+ *
+ * In the latter case we don't want to swap more, so we use the
+ * "privdata" argument setting it to a not NULL value to signal this
+ * condition. */
void vmThreadedIOCompletedJob(aeEventLoop *el, int fd, void *privdata,
int mask)
{
@@ -553,6 +571,8 @@ void vmThreadedIOCompletedJob(aeEventLoop *el, int fd, void *privdata,
REDIS_NOTUSED(mask);
REDIS_NOTUSED(privdata);
+ if (privdata != NULL) trytoswap = 0; /* check the comments above... */
+
/* For every byte we read in the read side of the pipe, there is one
* I/O job completed to process. */
while((retval = read(fd,buf,1)) == 1) {
@@ -864,7 +884,8 @@ void waitEmptyIOJobsQueue(void) {
io_processed_len = listLength(server.io_processed);
unlockThreadedIO();
if (io_processed_len) {
- vmThreadedIOCompletedJob(NULL,server.io_ready_pipe_read,NULL,0);
+ vmThreadedIOCompletedJob(NULL,server.io_ready_pipe_read,
+ (void*)0xdeadbeef,0);
usleep(1000); /* 1 millisecond */
} else {
usleep(10000); /* 10 milliseconds */
diff --git a/src/ziplist.c b/src/ziplist.c
index 7a3a8b01..4f44bd58 100644
--- a/src/ziplist.c
+++ b/src/ziplist.c
@@ -1,17 +1,63 @@
-/* Memory layout of a ziplist, containing "foo", "bar", "quux":
- * <zlbytes><zllen><len>"foo"<len>"bar"<len>"quux"
+/* The ziplist is a specially encoded dually linked list that is designed
+ * to be very memory efficient. It stores both strings and integer values,
+ * where integers are encoded as actual integers instead of a series of
+ * characters. It allows push and pop operations on either side of the list
+ * in O(1) time. However, because every operation requires a reallocation of
+ * the memory used by the ziplist, the actual complexity is related to the
+ * amount of memory used by the ziplist.
*
- * <zlbytes> is an unsigned integer to hold the number of bytes that
- * the ziplist occupies. This is stored to not have to traverse the ziplist
- * to know the new length when pushing.
+ * ----------------------------------------------------------------------------
*
- * <zllen> is the number of items in the ziplist. When this value is
- * greater than 254, we need to traverse the entire list to know
- * how many items it holds.
+ * ZIPLIST OVERALL LAYOUT:
+ * The general layout of the ziplist is as follows:
+ * <zlbytes><zltail><zllen><entry><entry><zlend>
*
- * <len> is the number of bytes occupied by a single entry. When this
- * number is greater than 253, the length will occupy 5 bytes, where
- * the extra bytes contain an unsigned integer to hold the length.
+ * <zlbytes> is an unsigned integer to hold the number of bytes that the
+ * ziplist occupies. This value needs to be stored to be able to resize the
+ * entire structure without the need to traverse it first.
+ *
+ * <zltail> is the offset to the last entry in the list. This allows a pop
+ * operation on the far side of the list without the need for full traversal.
+ *
+ * <zllen> is the number of entries. When this value is larger than 2**16-2,
+ * we need to traverse the entire list to know how many items it holds.
+ *
+ * <zlend> is a single byte special value, equal to 255, which indicates the
+ * end of the list.
+ *
+ * ZIPLIST ENTRIES:
+ * Every entry in the ziplist is prefixed by a header that contains two pieces
+ * of information. First, the length of the previous entry is stored to be
+ * able to traverse the list from back to front. Second, the encoding with an
+ * optional string length of the entry itself is stored.
+ *
+ * The length of the previous entry is encoded in the following way:
+ * If this length is smaller than 254 bytes, it will only consume a single
+ * byte that takes the length as value. When the length is greater than or
+ * equal to 254, it will consume 5 bytes. The first byte is set to 254 to
+ * indicate a larger value is following. The remaining 4 bytes take the
+ * length of the previous entry as value.
+ *
+ * The other header field of the entry itself depends on the contents of the
+ * entry. When the entry is a string, the first 2 bits of this header will hold
+ * the type of encoding used to store the length of the string, followed by the
+ * actual length of the string. When the entry is an integer the first 2 bits
+ * are both set to 1. The following 2 bits are used to specify what kind of
+ * integer will be stored after this header. An overview of the different
+ * types and encodings is as follows:
+ *
+ * |00pppppp| - 1 byte
+ * String value with length less than or equal to 63 bytes (6 bits).
+ * |01pppppp|qqqqqqqq| - 2 bytes
+ * String value with length less than or equal to 16383 bytes (14 bits).
+ * |10______|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5 bytes
+ * String value with length greater than or equal to 16384 bytes.
+ * |1100____| - 1 byte
+ * Integer encoded as int16_t (2 bytes).
+ * |1101____| - 1 byte
+ * Integer encoded as int32_t (4 bytes).
+ * |1110____| - 1 byte
+ * Integer encoded as int64_t (8 bytes).
*/
#include
@@ -25,25 +71,20 @@
int ll2string(char *s, size_t len, long long value);
-/* Important note: the ZIP_END value is used to depict the end of the
- * ziplist structure. When a pointer contains an entry, the first couple
- * of bytes contain the encoded length of the previous entry. This length
- * is encoded as ZIP_ENC_RAW length, so the first two bits will contain 00
- * and the byte will therefore never have a value of 255. */
#define ZIP_END 255
#define ZIP_BIGLEN 254
-/* Entry encoding */
-#define ZIP_ENC_RAW 0
-#define ZIP_ENC_INT16 1
-#define ZIP_ENC_INT32 2
-#define ZIP_ENC_INT64 3
-#define ZIP_ENCODING(p) ((p)[0] >> 6)
+/* Different encoding/length possibilities */
+#define ZIP_STR_06B (0 << 6)
+#define ZIP_STR_14B (1 << 6)
+#define ZIP_STR_32B (2 << 6)
+#define ZIP_INT_16B (0xc0 | 0<<4)
+#define ZIP_INT_32B (0xc0 | 1<<4)
+#define ZIP_INT_64B (0xc0 | 2<<4)
-/* Length encoding for raw entries */
-#define ZIP_LEN_INLINE 0
-#define ZIP_LEN_UINT16 1
-#define ZIP_LEN_UINT32 2
+/* Macros to determine type */
+#define ZIP_IS_STR(enc) (((enc) & 0xc0) < 0xc0)
+#define ZIP_IS_INT(enc) (!ZIP_IS_STR(enc) && ((enc) & 0x30) < 0x30)
/* Utility macros */
#define ZIPLIST_BYTES(zl) (*((uint32_t*)(zl)))
@@ -67,14 +108,25 @@ typedef struct zlentry {
unsigned char *p;
} zlentry;
+/* Return the encoding of the entry pointed to by 'p'. */
+static unsigned int zipEntryEncoding(unsigned char *p) {
+ /* String encoding: 2 MSBs */
+ unsigned char b = p[0] & 0xc0;
+ if (b < 0xc0) {
+ return b;
+ } else {
+ /* Integer encoding: 4 MSBs */
+ return p[0] & 0xf0;
+ }
+ assert(NULL);
+}
+
/* Return bytes needed to store integer encoded by 'encoding' */
-static unsigned int zipEncodingSize(unsigned char encoding) {
- if (encoding == ZIP_ENC_INT16) {
- return sizeof(int16_t);
- } else if (encoding == ZIP_ENC_INT32) {
- return sizeof(int32_t);
- } else if (encoding == ZIP_ENC_INT64) {
- return sizeof(int64_t);
+static unsigned int zipIntSize(unsigned char encoding) {
+ switch(encoding) {
+ case ZIP_INT_16B: return sizeof(int16_t);
+ case ZIP_INT_32B: return sizeof(int32_t);
+ case ZIP_INT_64B: return sizeof(int64_t);
}
assert(NULL);
}
@@ -82,23 +134,28 @@ static unsigned int zipEncodingSize(unsigned char encoding) {
/* Decode the encoded length pointed by 'p'. If a pointer to 'lensize' is
* provided, it is set to the number of bytes required to encode the length. */
static unsigned int zipDecodeLength(unsigned char *p, unsigned int *lensize) {
- unsigned char encoding = ZIP_ENCODING(p), lenenc;
+ unsigned char encoding = zipEntryEncoding(p);
unsigned int len;
- if (encoding == ZIP_ENC_RAW) {
- lenenc = (p[0] >> 4) & 0x3;
- if (lenenc == ZIP_LEN_INLINE) {
- len = p[0] & 0xf;
+ if (ZIP_IS_STR(encoding)) {
+ switch(encoding) {
+ case ZIP_STR_06B:
+ len = p[0] & 0x3f;
if (lensize) *lensize = 1;
- } else if (lenenc == ZIP_LEN_UINT16) {
- len = p[1] | (p[2] << 8);
- if (lensize) *lensize = 3;
- } else {
- len = p[1] | (p[2] << 8) | (p[3] << 16) | (p[4] << 24);
+ break;
+ case ZIP_STR_14B:
+ len = ((p[0] & 0x3f) << 8) | p[1];
+ if (lensize) *lensize = 2;
+ break;
+ case ZIP_STR_32B:
+ len = (p[1] << 24) | (p[2] << 16) | (p[3] << 8) | p[4];
if (lensize) *lensize = 5;
+ break;
+ default:
+ assert(NULL);
}
} else {
- len = zipEncodingSize(encoding);
+ len = zipIntSize(encoding);
if (lensize) *lensize = 1;
}
return len;
@@ -106,34 +163,36 @@ static unsigned int zipDecodeLength(unsigned char *p, unsigned int *lensize) {
/* Encode the length 'rawlen' writing it in 'p'. If p is NULL it just returns
* the amount of bytes required to encode such a length. */
-static unsigned int zipEncodeLength(unsigned char *p, char encoding, unsigned int rawlen) {
- unsigned char len = 1, lenenc, buf[5];
- if (encoding == ZIP_ENC_RAW) {
- if (rawlen <= 0xf) {
+static unsigned int zipEncodeLength(unsigned char *p, unsigned char encoding, unsigned int rawlen) {
+ unsigned char len = 1, buf[5];
+
+ if (ZIP_IS_STR(encoding)) {
+ /* Although encoding is given it may not be set for strings,
+ * so we determine it here using the raw length. */
+ if (rawlen <= 0x3f) {
if (!p) return len;
- lenenc = ZIP_LEN_INLINE;
- buf[0] = rawlen;
- } else if (rawlen <= 0xffff) {
- len += 2;
+ buf[0] = ZIP_STR_06B | rawlen;
+ } else if (rawlen <= 0x3fff) {
+ len += 1;
if (!p) return len;
- lenenc = ZIP_LEN_UINT16;
- buf[1] = (rawlen ) & 0xff;
- buf[2] = (rawlen >> 8) & 0xff;
+ buf[0] = ZIP_STR_14B | ((rawlen >> 8) & 0x3f);
+ buf[1] = rawlen & 0xff;
} else {
len += 4;
if (!p) return len;
- lenenc = ZIP_LEN_UINT32;
- buf[1] = (rawlen ) & 0xff;
- buf[2] = (rawlen >> 8) & 0xff;
- buf[3] = (rawlen >> 16) & 0xff;
- buf[4] = (rawlen >> 24) & 0xff;
+ buf[0] = ZIP_STR_32B;
+ buf[1] = (rawlen >> 24) & 0xff;
+ buf[2] = (rawlen >> 16) & 0xff;
+ buf[3] = (rawlen >> 8) & 0xff;
+ buf[4] = rawlen & 0xff;
}
- buf[0] = (lenenc << 4) | (buf[0] & 0xf);
+ } else {
+ /* Implies integer encoding, so length is always 1. */
+ if (!p) return len;
+ buf[0] = encoding;
}
- if (!p) return len;
- /* Apparently we need to store the length in 'p' */
- buf[0] = (encoding << 6) | (buf[0] & 0x3f);
+ /* Store this length at p */
memcpy(p,buf,len);
return len;
}
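
The new string-length header packs the encoding type into the two most significant bits of the first byte and stores multi-byte lengths big-endian. A minimal sketch of that encode/decode pair is shown here for strings only. The `sketch_*` names are hypothetical, the constants mirror `ZIP_STR_06B`, `ZIP_STR_14B` and `ZIP_STR_32B` from the diff, and integer encodings are deliberately left out.

```c
#include <stdint.h>

#define SKETCH_STR_06B (0 << 6)  /* |00pppppp|, length <= 63 */
#define SKETCH_STR_14B (1 << 6)  /* |01pppppp|qqqqqqqq|, length <= 16383 */
#define SKETCH_STR_32B (2 << 6)  /* |10______| followed by 4 length bytes */

/* Write the header for a string of len bytes into buf (at least 5 bytes)
 * and return the number of header bytes used. */
static unsigned int sketch_encode_strlen(unsigned char *buf, uint32_t len) {
    if (len <= 0x3f) {
        buf[0] = (unsigned char)(SKETCH_STR_06B | len);
        return 1;
    } else if (len <= 0x3fff) {
        buf[0] = (unsigned char)(SKETCH_STR_14B | ((len >> 8) & 0x3f));
        buf[1] = (unsigned char)(len & 0xff);
        return 2;
    }
    buf[0] = SKETCH_STR_32B;
    buf[1] = (unsigned char)((len >> 24) & 0xff);
    buf[2] = (unsigned char)((len >> 16) & 0xff);
    buf[3] = (unsigned char)((len >> 8) & 0xff);
    buf[4] = (unsigned char)(len & 0xff);
    return 5;
}

/* Decode a string header. Sets *lensize to the header size, or to 0 when
 * the two top bits are 11 (an integer encoding, not handled here). */
static uint32_t sketch_decode_strlen(const unsigned char *p,
                                     unsigned int *lensize) {
    switch (p[0] & 0xc0) {
    case SKETCH_STR_06B:
        *lensize = 1;
        return p[0] & 0x3f;
    case SKETCH_STR_14B:
        *lensize = 2;
        return ((uint32_t)(p[0] & 0x3f) << 8) | p[1];
    case SKETCH_STR_32B:
        *lensize = 5;
        return ((uint32_t)p[1] << 24) | ((uint32_t)p[2] << 16) |
               ((uint32_t)p[3] << 8) | p[4];
    default:
        *lensize = 0;
        return 0;
    }
}
```

The casts to `uint32_t` before shifting are a choice made in this sketch to keep the 24-bit shift well defined for byte values of 128 and above.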
@@ -167,6 +226,14 @@ static unsigned int zipPrevEncodeLength(unsigned char *p, unsigned int len) {
}
}
+/* Encode the length of the previous entry and write it to "p". This only
+ * uses the larger encoding (required in __ziplistCascadeUpdate). */
+static void zipPrevEncodeLengthForceLarge(unsigned char *p, unsigned int len) {
+ if (p == NULL) return;
+ p[0] = ZIP_BIGLEN;
+ memcpy(p+1,&len,sizeof(len));
+}
+
/* Return the difference in number of bytes needed to store the new length
* "len" on the entry pointed to by "p". */
static int zipPrevLenByteDiff(unsigned char *p, unsigned int len) {
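
The previous-entry length field described in the header comment uses one byte for lengths below 254 and otherwise a 254 marker followed by four length bytes. A minimal sketch of that rule, with hypothetical `sketch_*` names and the length copied in host byte order as `zipPrevEncodeLengthForceLarge` does in the hunk above:

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_BIGLEN 254  /* mirrors ZIP_BIGLEN in the diff */

/* Write the previous-entry length into buf (at least 5 bytes) and return
 * the number of bytes used. */
static unsigned int sketch_prevlen_encode(unsigned char *buf, uint32_t len) {
    if (len < SKETCH_BIGLEN) {
        buf[0] = (unsigned char)len;
        return 1;
    }
    buf[0] = SKETCH_BIGLEN;
    memcpy(buf + 1, &len, sizeof(len)); /* host byte order */
    return 1 + (unsigned int)sizeof(len);
}

/* Read the previous-entry length back and report the field size. */
static uint32_t sketch_prevlen_decode(const unsigned char *p,
                                      unsigned int *lensize) {
    uint32_t len;
    if (p[0] < SKETCH_BIGLEN) {
        if (lensize) *lensize = 1;
        return p[0];
    }
    memcpy(&len, p + 1, sizeof(len));
    if (lensize) *lensize = 1 + (unsigned int)sizeof(len);
    return len;
}
```

The jump from a 1-byte to a 5-byte field at length 254 is what makes the cascade update necessary: growing one entry past that threshold can force the next entry's header to grow by four bytes as well.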
@@ -198,11 +265,11 @@ static int zipTryEncoding(unsigned char *entry, unsigned int entrylen, long long
/* Great, the string can be encoded. Check what's the smallest
* of our encoding types that can hold this value. */
if (value >= INT16_MIN && value <= INT16_MAX) {
- *encoding = ZIP_ENC_INT16;
+ *encoding = ZIP_INT_16B;
} else if (value >= INT32_MIN && value <= INT32_MAX) {
- *encoding = ZIP_ENC_INT32;
+ *encoding = ZIP_INT_32B;
} else {
- *encoding = ZIP_ENC_INT64;
+ *encoding = ZIP_INT_64B;
}
*v = value;
return 1;
@@ -215,13 +282,13 @@ static void zipSaveInteger(unsigned char *p, int64_t value, unsigned char encodi
int16_t i16;
int32_t i32;
int64_t i64;
- if (encoding == ZIP_ENC_INT16) {
+ if (encoding == ZIP_INT_16B) {
i16 = value;
memcpy(p,&i16,sizeof(i16));
- } else if (encoding == ZIP_ENC_INT32) {
+ } else if (encoding == ZIP_INT_32B) {
i32 = value;
memcpy(p,&i32,sizeof(i32));
- } else if (encoding == ZIP_ENC_INT64) {
+ } else if (encoding == ZIP_INT_64B) {
i64 = value;
memcpy(p,&i64,sizeof(i64));
} else {
@@ -234,13 +301,13 @@ static int64_t zipLoadInteger(unsigned char *p, unsigned char encoding) {
int16_t i16;
int32_t i32;
int64_t i64, ret;
- if (encoding == ZIP_ENC_INT16) {
+ if (encoding == ZIP_INT_16B) {
memcpy(&i16,p,sizeof(i16));
ret = i16;
- } else if (encoding == ZIP_ENC_INT32) {
+ } else if (encoding == ZIP_INT_32B) {
memcpy(&i32,p,sizeof(i32));
ret = i32;
- } else if (encoding == ZIP_ENC_INT64) {
+ } else if (encoding == ZIP_INT_64B) {
memcpy(&i64,p,sizeof(i64));
ret = i64;
} else {
@@ -255,7 +322,7 @@ static zlentry zipEntry(unsigned char *p) {
e.prevrawlen = zipPrevDecodeLength(p,&e.prevrawlensize);
e.len = zipDecodeLength(p+e.prevrawlensize,&e.lensize);
e.headersize = e.prevrawlensize+e.lensize;
- e.encoding = ZIP_ENCODING(p+e.prevrawlensize);
+ e.encoding = zipEntryEncoding(p+e.prevrawlensize);
e.p = p;
return e;
}
@@ -285,11 +352,86 @@ static unsigned char *ziplistResize(unsigned char *zl, unsigned int len) {
return zl;
}
+/* When an entry is inserted, we need to set the prevlen field of the next
+ * entry to equal the length of the inserted entry. It can occur that this
+ * length cannot be encoded in 1 byte and the next entry needs to grow
+ * a bit larger to hold the 5-byte encoded prevlen. This can be done for free,
+ * because this only happens when an entry is already being inserted (which
+ * causes a realloc and memmove). However, encoding the prevlen may require
+ * that this entry is grown as well. This effect may cascade throughout
+ * the ziplist when there are consecutive entries with a size close to
+ * ZIP_BIGLEN, so we need to check that the prevlen can be encoded in every
+ * consecutive entry.
+ *
+ * Note that this effect can also happen in reverse, where the bytes required
+ * to encode the prevlen field can shrink. This effect is deliberately ignored,
+ * because it can cause a "flapping" effect where a chain of prevlen fields is
+ * first grown and then shrunk again after consecutive inserts. Rather, the
+ * field is allowed to stay larger than necessary, because a large prevlen
+ * field implies the ziplist is holding large entries anyway.
+ *
+ * The pointer "p" points to the first entry that does NOT need to be
+ * updated, i.e. consecutive fields MAY need an update. */
+static unsigned char *__ziplistCascadeUpdate(unsigned char *zl, unsigned char *p) {
+ unsigned int curlen = ZIPLIST_BYTES(zl), rawlen, rawlensize;
+ unsigned int offset, noffset, extra;
+ unsigned char *np;
+ zlentry cur, next;
+
+ while (p[0] != ZIP_END) {
+ cur = zipEntry(p);
+ rawlen = cur.headersize + cur.len;
+ rawlensize = zipPrevEncodeLength(NULL,rawlen);
+
+ /* Abort if there is no next entry. */
+ if (p[rawlen] == ZIP_END) break;
+ next = zipEntry(p+rawlen);
+
+ /* Abort when "prevlen" has not changed. */
+ if (next.prevrawlen == rawlen) break;
+
+ if (next.prevrawlensize < rawlensize) {
+ /* The "prevlen" field of "next" needs more bytes to hold
+ * the raw length of "cur". */
+ offset = p-zl;
+ extra = rawlensize-next.prevrawlensize;
+ zl = ziplistResize(zl,curlen+extra);
+ ZIPLIST_TAIL_OFFSET(zl) += extra;
+ p = zl+offset;
+
+ /* Move the tail to the back. */
+ np = p+rawlen;
+ noffset = np-zl;
+ memmove(np+rawlensize,
+ np+next.prevrawlensize,
+ curlen-noffset-next.prevrawlensize-1);
+ zipPrevEncodeLength(np,rawlen);
+
+ /* Advance the cursor */
+ p += rawlen;
+ } else {
+ if (next.prevrawlensize > rawlensize) {
+ /* This would result in shrinking, which we want to avoid.
+ * So, set "rawlen" in the available bytes. */
+ zipPrevEncodeLengthForceLarge(p+rawlen,rawlen);
+ } else {
+ zipPrevEncodeLength(p+rawlen,rawlen);
+ }
+
+ /* Stop here, as the raw length of "next" has not changed. */
+ break;
+ }
+ }
+ return zl;
+}
+
/* Delete "num" entries, starting at "p". Returns pointer to the ziplist. */
static unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsigned int num) {
unsigned int i, totlen, deleted = 0;
- int nextdiff = 0;
- zlentry first = zipEntry(p);
+ int offset, nextdiff = 0;
+ zlentry first, tail;
+
+ first = zipEntry(p);
for (i = 0; p[0] != ZIP_END && i < num; i++) {
p += zipRawEntryLength(p);
deleted++;
@@ -306,7 +448,14 @@ static unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsig
zipPrevEncodeLength(p-nextdiff,first.prevrawlen);
/* Update offset for tail */
- ZIPLIST_TAIL_OFFSET(zl) -= totlen+nextdiff;
+ ZIPLIST_TAIL_OFFSET(zl) -= totlen;
+
+ /* When the tail contains more than one entry, we need to take
+ * "nextdiff" into account as well. Otherwise, a change in the
+ * size of prevlen doesn't have an effect on the *tail* offset. */
+ tail = zipEntry(p);
+ if (p[tail.headersize+tail.len] != ZIP_END)
+ ZIPLIST_TAIL_OFFSET(zl) += nextdiff;
/* Move tail to the front of the ziplist */
memmove(first.p,p-nextdiff,ZIPLIST_BYTES(zl)-(p-zl)-1+nextdiff);
@@ -316,8 +465,15 @@ static unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsig
}
/* Resize and update length */
+ offset = first.p-zl;
zl = ziplistResize(zl, ZIPLIST_BYTES(zl)-totlen+nextdiff);
ZIPLIST_INCR_LENGTH(zl,-deleted);
+ p = zl+offset;
+
+ /* When nextdiff != 0, the raw length of the next entry has changed, so
+ * we need to cascade the update throughout the ziplist */
+ if (nextdiff != 0)
+ zl = __ziplistCascadeUpdate(zl,p);
}
return zl;
}
@@ -326,29 +482,30 @@ static unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsig
static unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {
unsigned int curlen = ZIPLIST_BYTES(zl), reqlen, prevlen = 0;
unsigned int offset, nextdiff = 0;
- unsigned char *tail;
- unsigned char encoding = ZIP_ENC_RAW;
+ unsigned char encoding = 0;
long long value;
- zlentry entry;
+ zlentry entry, tail;
/* Find out prevlen for the entry that is inserted. */
if (p[0] != ZIP_END) {
entry = zipEntry(p);
prevlen = entry.prevrawlen;
} else {
- tail = ZIPLIST_ENTRY_TAIL(zl);
- if (tail[0] != ZIP_END) {
- prevlen = zipRawEntryLength(tail);
+ unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl);
+ if (ptail[0] != ZIP_END) {
+ prevlen = zipRawEntryLength(ptail);
}
}
/* See if the entry can be encoded */
if (zipTryEncoding(s,slen,&value,&encoding)) {
- reqlen = zipEncodingSize(encoding);
+ /* 'encoding' is set to the appropriate integer encoding */
+ reqlen = zipIntSize(encoding);
} else {
+ /* 'encoding' is untouched, however zipEncodeLength will use the
+ * string length to figure out how to encode it. */
reqlen = slen;
}
-
/* We need space for both the length of the previous entry and
* the length of the payload. */
reqlen += zipPrevEncodeLength(NULL,prevlen);
@@ -368,22 +525,39 @@ static unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsig
if (p[0] != ZIP_END) {
/* Subtract one because of the ZIP_END bytes */
memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff);
+
/* Encode this entry's raw length in the next entry. */
zipPrevEncodeLength(p+reqlen,reqlen);
+
/* Update offset for tail */
- ZIPLIST_TAIL_OFFSET(zl) += reqlen+nextdiff;
+ ZIPLIST_TAIL_OFFSET(zl) += reqlen;
+
+ /* When the tail contains more than one entry, we need to take
+ * "nextdiff" into account as well. Otherwise, a change in the
+ * size of prevlen doesn't have an effect on the *tail* offset. */
+ tail = zipEntry(p+reqlen);
+ if (p[reqlen+tail.headersize+tail.len] != ZIP_END)
+ ZIPLIST_TAIL_OFFSET(zl) += nextdiff;
} else {
/* This element will be the new tail. */
ZIPLIST_TAIL_OFFSET(zl) = p-zl;
}
+ /* When nextdiff != 0, the raw length of the next entry has changed, so
+ * we need to cascade the update throughout the ziplist */
+ if (nextdiff != 0) {
+ offset = p-zl;
+ zl = __ziplistCascadeUpdate(zl,p+reqlen);
+ p = zl+offset;
+ }
+
/* Write the entry */
p += zipPrevEncodeLength(p,prevlen);
p += zipEncodeLength(p,encoding,slen);
- if (encoding != ZIP_ENC_RAW) {
- zipSaveInteger(p,value,encoding);
- } else {
+ if (ZIP_IS_STR(encoding)) {
memcpy(p,s,slen);
+ } else {
+ zipSaveInteger(p,value,encoding);
}
ZIPLIST_INCR_LENGTH(zl,1);
return zl;
@@ -449,6 +623,7 @@ unsigned char *ziplistPrev(unsigned char *zl, unsigned char *p) {
return NULL;
} else {
entry = zipEntry(p);
+ assert(entry.prevrawlen > 0);
return p-entry.prevrawlen;
}
}
@@ -463,7 +638,7 @@ unsigned int ziplistGet(unsigned char *p, unsigned char **sstr, unsigned int *sl
if (sstr) *sstr = NULL;
entry = zipEntry(p);
- if (entry.encoding == ZIP_ENC_RAW) {
+ if (ZIP_IS_STR(entry.encoding)) {
if (sstr) {
*slen = entry.len;
*sstr = p+entry.headersize;
@@ -510,7 +685,7 @@ unsigned int ziplistCompare(unsigned char *p, unsigned char *sstr, unsigned int
if (p[0] == ZIP_END) return 0;
entry = zipEntry(p);
- if (entry.encoding == ZIP_ENC_RAW) {
+ if (ZIP_IS_STR(entry.encoding)) {
/* Raw compare */
if (entry.len == slen) {
return memcmp(p+entry.headersize,sstr,slen) == 0;
@@ -554,21 +729,52 @@ unsigned int ziplistSize(unsigned char *zl) {
void ziplistRepr(unsigned char *zl) {
unsigned char *p;
+ int index = 0;
zlentry entry;
- printf("{total bytes %d} {length %u}\n",ZIPLIST_BYTES(zl), ZIPLIST_LENGTH(zl));
+ printf(
+ "{total bytes %d} "
+ "{length %u}\n"
+ "{tail offset %u}\n",
+ ZIPLIST_BYTES(zl),
+ ZIPLIST_LENGTH(zl),
+ ZIPLIST_TAIL_OFFSET(zl));
p = ZIPLIST_ENTRY_HEAD(zl);
while(*p != ZIP_END) {
entry = zipEntry(p);
- printf("{offset %ld, header %u, payload %u} ",p-zl,entry.headersize,entry.len);
+ printf(
+ "{"
+ "addr 0x%08lx, "
+ "index %2d, "
+ "offset %5ld, "
+ "rl: %5u, "
+ "hs %2u, "
+ "pl: %5u, "
+ "pls: %2u, "
+ "payload %5u"
+ "} ",
+ (long unsigned int)p,
+ index,
+ p-zl,
+ entry.headersize+entry.len,
+ entry.headersize,
+ entry.prevrawlen,
+ entry.prevrawlensize,
+ entry.len);
p += entry.headersize;
- if (entry.encoding == ZIP_ENC_RAW) {
- fwrite(p,entry.len,1,stdout);
+ if (ZIP_IS_STR(entry.encoding)) {
+ if (entry.len > 40) {
+ fwrite(p,40,1,stdout);
+ printf("...");
+ } else {
+ fwrite(p,entry.len,1,stdout);
+ }
} else {
printf("%lld", (long long) zipLoadInteger(p,entry.encoding));
}
printf("\n");
p += entry.len;
+ index++;
}
printf("{end}\n\n");
}
@@ -664,6 +870,10 @@ int main(int argc, char **argv) {
unsigned int elen;
long long value;
+ /* If an argument is given, use it as the random seed. */
+ if (argc == 2)
+ srand(atoi(argv[1]));
+
zl = createIntList();
ziplistRepr(zl);
@@ -915,6 +1125,25 @@ int main(int argc, char **argv) {
ziplistRepr(zl);
}
+ printf("Regression test for >255 byte strings:\n");
+ {
+ char v1[257] = {0}, v2[257] = {0}; /* zero-fill so strlen() finds a terminator */
+ memset(v1,'x',256);
+ memset(v2,'y',256);
+ zl = ziplistNew();
+ zl = ziplistPush(zl,(unsigned char*)v1,strlen(v1),ZIPLIST_TAIL);
+ zl = ziplistPush(zl,(unsigned char*)v2,strlen(v2),ZIPLIST_TAIL);
+
+ /* Pop values again and compare their value. */
+ p = ziplistIndex(zl,0);
+ assert(ziplistGet(p,&entry,&elen,&value));
+ assert(strncmp(v1,entry,elen) == 0);
+ p = ziplistIndex(zl,1);
+ assert(ziplistGet(p,&entry,&elen,&value));
+ assert(strncmp(v2,entry,elen) == 0);
+ printf("SUCCESS\n\n");
+ }
+
printf("Create long list and check indices:\n");
{
zl = ziplistNew();
@@ -958,7 +1187,57 @@ int main(int argc, char **argv) {
printf("ERROR: \"1025\"\n");
return 1;
}
- printf("SUCCESS\n");
+ printf("SUCCESS\n\n");
+ }
+
+ printf("Stress with random payloads of different encoding:\n");
+ {
+ int i, idx, where, len;
+ long long v;
+ unsigned char *p;
+ char buf[0x4041]; /* max length of generated string */
+ zl = ziplistNew();
+ for (i = 0; i < 100000; i++) {
+ where = (rand() & 1) ? ZIPLIST_HEAD : ZIPLIST_TAIL;
+ if (rand() & 1) {
+ /* equally likely create a 16, 32 or 64 bit int */
+ v = (rand() & INT16_MAX) + ((1ll << 32) >> ((rand() % 3)*16));
+ v *= 2*(rand() & 1)-1; /* randomly flip sign */
+ sprintf(buf, "%lld", v);
+ zl = ziplistPush(zl, (unsigned char*)buf, strlen(buf), where);
+ } else {
+ /* equally likely generate 6, 14 or >14 bit length */
+ v = rand() & 0x3f;
+ v += 0x4000 >> ((rand() % 3)*8);
+ memset(buf, 'x', v);
+ zl = ziplistPush(zl, (unsigned char*)buf, v, where);
+ }
+
+ /* delete a random element */
+ if ((len = ziplistLen(zl)) >= 10) {
+ idx = rand() % len;
+ // printf("Delete index %d\n", idx);
+ // ziplistRepr(zl);
+ zl = ziplistDeleteRange(zl, idx, 1);
+ // ziplistRepr(zl);
+ len--;
+ }
+
+ /* iterate from front to back */
+ idx = 0;
+ p = ziplistIndex(zl, 0);
+ while((p = ziplistNext(zl,p)))
+ idx++;
+ assert(len == idx+1);
+
+ /* iterate from back to front */
+ idx = 0;
+ p = ziplistIndex(zl, -1);
+ while((p = ziplistPrev(zl,p)))
+ idx++;
+ assert(len == idx+1);
+ }
+ printf("SUCCESS\n\n");
}
printf("Stress with variable ziplist size:\n");
diff --git a/src/zmalloc.c b/src/zmalloc.c
index 5c1b5e9a..544155e7 100644
--- a/src/zmalloc.c
+++ b/src/zmalloc.c
@@ -32,6 +32,7 @@
#include
#include
#include
+
#include "config.h"
#if defined(__sun)
@@ -170,3 +171,69 @@ size_t zmalloc_used_memory(void) {
void zmalloc_enable_thread_safeness(void) {
zmalloc_thread_safe = 1;
}
+
+/* Fragmentation = RSS / allocated-bytes */
+
+#if defined(HAVE_PROCFS)
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+
+float zmalloc_get_fragmentation_ratio(void) {
+ size_t allocated = zmalloc_used_memory();
+ int page = sysconf(_SC_PAGESIZE);
+ size_t rss;
+ char buf[4096];
+ char filename[256];
+ int fd, count;
+ char *p, *x;
+
+ snprintf(filename,256,"/proc/%d/stat",getpid());
+ if ((fd = open(filename,O_RDONLY)) == -1) return 0;
+ if (read(fd,buf,4096) <= 0) {
+ close(fd);
+ return 0;
+ }
+ close(fd);
+
+ p = buf;
+ count = 23; /* RSS is the 24th field in /proc/<pid>/stat */
+ while(p && count--) {
+ p = strchr(p,' ');
+ if (p) p++;
+ }
+ if (!p) return 0;
+ x = strchr(p,' ');
+ if (!x) return 0;
+ *x = '\0';
+
+ rss = strtoll(p,NULL,10);
+ rss *= page;
+ return (float)rss/allocated;
+}
+#elif defined(HAVE_TASKINFO)
+#include <unistd.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <sys/sysctl.h>
+#include <mach/task.h>
+#include <mach/mach_init.h>
+
+float zmalloc_get_fragmentation_ratio(void) {
+ task_t task = MACH_PORT_NULL;
+ struct task_basic_info t_info;
+ mach_msg_type_number_t t_info_count = TASK_BASIC_INFO_COUNT;
+
+ if (task_for_pid(current_task(), getpid(), &task) != KERN_SUCCESS)
+ return 0;
+ task_info(task, TASK_BASIC_INFO, (task_info_t)&t_info, &t_info_count);
+
+ return (float)t_info.resident_size/zmalloc_used_memory();
+}
+#else
+float zmalloc_get_fragmentation_ratio(void) {
+ return 0;
+}
+#endif
diff --git a/src/zmalloc.h b/src/zmalloc.h
index db858bba..281aa3a8 100644
--- a/src/zmalloc.h
+++ b/src/zmalloc.h
@@ -38,5 +38,6 @@ void zfree(void *ptr);
char *zstrdup(const char *s);
size_t zmalloc_used_memory(void);
void zmalloc_enable_thread_safeness(void);
+float zmalloc_get_fragmentation_ratio(void);
#endif /* _ZMALLOC_H */
diff --git a/tests/integration/redis-cli.tcl b/tests/integration/redis-cli.tcl
new file mode 100644
index 00000000..40e4222e
--- /dev/null
+++ b/tests/integration/redis-cli.tcl
@@ -0,0 +1,208 @@
+start_server {tags {"cli"}} {
+ proc open_cli {} {
+ set ::env(TERM) dumb
+ set fd [open [format "|src/redis-cli -p %d -n 9" [srv port]] "r+"]
+ fconfigure $fd -buffering none
+ fconfigure $fd -blocking false
+ fconfigure $fd -translation binary
+ assert_equal "redis> " [read_cli $fd]
+ set _ $fd
+ }
+
+ proc close_cli {fd} {
+ close $fd
+ }
+
+ proc read_cli {fd} {
+ set buf [read $fd]
+ while {[string length $buf] == 0} {
+ # wait some time and try again
+ after 10
+ set buf [read $fd]
+ }
+ set _ $buf
+ }
+
+ proc write_cli {fd buf} {
+ puts $fd $buf
+ flush $fd
+ }
+
+ # Helpers to run tests in interactive mode
+ proc run_command {fd cmd} {
+ write_cli $fd $cmd
+ set lines [split [read_cli $fd] "\n"]
+ assert_equal "redis> " [lindex $lines end]
+ join [lrange $lines 0 end-1] "\n"
+ }
+
+ proc test_interactive_cli {name code} {
+ set ::env(FAKETTY) 1
+ set fd [open_cli]
+ test "Interactive CLI: $name" $code
+ close_cli $fd
+ unset ::env(FAKETTY)
+ }
+
+ # Helpers to run tests where stdout is not a tty
+ proc write_tmpfile {contents} {
+ set tmp [tmpfile "cli"]
+ set tmpfd [open $tmp "w"]
+ puts -nonewline $tmpfd $contents
+ close $tmpfd
+ set _ $tmp
+ }
+
+ proc _run_cli {opts args} {
+ set cmd [format "src/redis-cli -p %d -n 9 $args" [srv port]]
+ foreach {key value} $opts {
+ if {$key eq "pipe"} {
+ set cmd "sh -c \"$value | $cmd\""
+ }
+ if {$key eq "path"} {
+ set cmd "$cmd < $value"
+ }
+ }
+
+ set fd [open "|$cmd" "r"]
+ fconfigure $fd -buffering none
+ fconfigure $fd -translation binary
+ set resp [read $fd 1048576]
+ close $fd
+ set _ $resp
+ }
+
+ proc run_cli {args} {
+ _run_cli {} {*}$args
+ }
+
+ proc run_cli_with_input_pipe {cmd args} {
+ _run_cli [list pipe $cmd] {*}$args
+ }
+
+ proc run_cli_with_input_file {path args} {
+ _run_cli [list path $path] {*}$args
+ }
+
+ proc test_nontty_cli {name code} {
+ test "Non-interactive non-TTY CLI: $name" $code
+ }
+
+ # Helpers to run tests where stdout is a tty (fake it)
+ proc test_tty_cli {name code} {
+ set ::env(FAKETTY) 1
+ test "Non-interactive TTY CLI: $name" $code
+ unset ::env(FAKETTY)
+ }
+
+ test_interactive_cli "INFO response should be printed raw" {
+ set lines [split [run_command $fd info] "\n"]
+ foreach line $lines {
+ assert [regexp {^[a-z0-9_]+:[a-z0-9_]+} $line]
+ }
+ }
+
+ test_interactive_cli "Status reply" {
+ assert_equal "OK" [run_command $fd "set key foo"]
+ }
+
+ test_interactive_cli "Integer reply" {
+ assert_equal "(integer) 1" [run_command $fd "incr counter"]
+ }
+
+ test_interactive_cli "Bulk reply" {
+ r set key foo
+ assert_equal "\"foo\"" [run_command $fd "get key"]
+ }
+
+ test_interactive_cli "Multi-bulk reply" {
+ r rpush list foo
+ r rpush list bar
+ assert_equal "1. \"foo\"\n2. \"bar\"" [run_command $fd "lrange list 0 -1"]
+ }
+
+ test_interactive_cli "Parsing quotes" {
+ assert_equal "OK" [run_command $fd "set key \"bar\""]
+ assert_equal "bar" [r get key]
+ assert_equal "OK" [run_command $fd "set key \" bar \""]
+ assert_equal " bar " [r get key]
+ assert_equal "OK" [run_command $fd "set key \"\\\"bar\\\"\""]
+ assert_equal "\"bar\"" [r get key]
+ assert_equal "OK" [run_command $fd "set key \"\tbar\t\""]
+ assert_equal "\tbar\t" [r get key]
+
+ # invalid quotation
+ assert_equal "Invalid argument(s)" [run_command $fd "get \"\"key"]
+ assert_equal "Invalid argument(s)" [run_command $fd "get \"key\"x"]
+
+ # quotes after the argument are weird, but should be allowed
+ assert_equal "OK" [run_command $fd "set key\"\" bar"]
+ assert_equal "bar" [r get key]
+ }
+
+ test_tty_cli "Status reply" {
+ assert_equal "OK\n" [run_cli set key bar]
+ assert_equal "bar" [r get key]
+ }
+
+ test_tty_cli "Integer reply" {
+ r del counter
+ assert_equal "(integer) 1\n" [run_cli incr counter]
+ }
+
+ test_tty_cli "Bulk reply" {
+ r set key "tab\tnewline\n"
+ assert_equal "\"tab\\tnewline\\n\"\n" [run_cli get key]
+ }
+
+ test_tty_cli "Multi-bulk reply" {
+ r del list
+ r rpush list foo
+ r rpush list bar
+ assert_equal "1. \"foo\"\n2. \"bar\"\n" [run_cli lrange list 0 -1]
+ }
+
+ test_tty_cli "Read last argument from pipe" {
+ assert_equal "OK\n" [run_cli_with_input_pipe "echo foo" set key]
+ assert_equal "foo\n" [r get key]
+ }
+
+ test_tty_cli "Read last argument from file" {
+ set tmpfile [write_tmpfile "from file"]
+ assert_equal "OK\n" [run_cli_with_input_file $tmpfile set key]
+ assert_equal "from file" [r get key]
+ }
+
+ test_nontty_cli "Status reply" {
+ assert_equal "OK" [run_cli set key bar]
+ assert_equal "bar" [r get key]
+ }
+
+ test_nontty_cli "Integer reply" {
+ r del counter
+ assert_equal "1" [run_cli incr counter]
+ }
+
+ test_nontty_cli "Bulk reply" {
+ r set key "tab\tnewline\n"
+ assert_equal "tab\tnewline\n" [run_cli get key]
+ }
+
+ test_nontty_cli "Multi-bulk reply" {
+ r del list
+ r rpush list foo
+ r rpush list bar
+ assert_equal "foo\nbar" [run_cli lrange list 0 -1]
+ }
+
+ test_nontty_cli "Read last argument from pipe" {
+ assert_equal "OK" [run_cli_with_input_pipe "echo foo" set key]
+ assert_equal "foo\n" [r get key]
+ }
+
+ test_nontty_cli "Read last argument from file" {
+ set tmpfile [write_tmpfile "from file"]
+ assert_equal "OK" [run_cli_with_input_file $tmpfile set key]
+ assert_equal "from file" [r get key]
+ }
+}
diff --git a/tests/integration/replication.tcl b/tests/integration/replication.tcl
index 4b258825..6ca5a6dd 100644
--- a/tests/integration/replication.tcl
+++ b/tests/integration/replication.tcl
@@ -23,6 +23,24 @@ start_server {tags {"repl"}} {
}
assert_equal [r debug digest] [r -1 debug digest]
}
+
+ test {MASTER and SLAVE consistency with expire} {
+ createComplexDataset r 50000 useexpire
+ after 4000 ;# Make sure everything expired before taking the digest
+ if {[r debug digest] ne [r -1 debug digest]} {
+ set csv1 [csvdump r]
+ set csv2 [csvdump {r -1}]
+ set fd [open /tmp/repldump1.txt w]
+ puts -nonewline $fd $csv1
+ close $fd
+ set fd [open /tmp/repldump2.txt w]
+ puts -nonewline $fd $csv2
+ close $fd
+ puts "Master - Slave inconsistency"
+ puts "Run diff -u against /tmp/repldump*.txt for more info"
+ }
+ assert_equal [r debug digest] [r -1 debug digest]
+ }
}
}
diff --git a/tests/support/server.tcl b/tests/support/server.tcl
index 8e226a7d..e5ca6c6c 100644
--- a/tests/support/server.tcl
+++ b/tests/support/server.tcl
@@ -83,7 +83,9 @@ proc ping_server {host port} {
}
close $fd
} e]} {
- puts "Can't PING server at $host:$port... $e"
+ puts -nonewline "."
+ } else {
+ puts -nonewline "ok"
}
return $retval
}
@@ -170,14 +172,33 @@ proc start_server {options {code undefined}} {
if {$::valgrind} {
exec valgrind src/redis-server $config_file > $stdout 2> $stderr &
- after 2000
} else {
exec src/redis-server $config_file > $stdout 2> $stderr &
- after 500
}
# check that the server actually started
- if {$code ne "undefined" && ![ping_server $::host $::port]} {
+ # ugly but tries to be as fast as possible...
+ set retrynum 20
+ set serverisup 0
+
+ puts -nonewline "=== ($tags) Starting server ${::host}:${::port} "
+ after 10
+ if {$code ne "undefined"} {
+ while {[incr retrynum -1]} {
+ catch {
+ if {[ping_server $::host $::port]} {
+ set serverisup 1
+ }
+ }
+ if {$serverisup} break
+ after 50
+ }
+ } else {
+ set serverisup 1
+ }
+ puts {}
+
+ if {!$serverisup} {
error_and_quit $config_file [exec cat $stderr]
}
@@ -230,7 +251,11 @@ proc start_server {options {code undefined}} {
# execute provided block
set curnum $::testnum
- catch { uplevel 1 $code } err
+ if {![catch { uplevel 1 $code } err]} {
+ # zero exit status is good
+ unset err
+ }
+
if {$curnum == $::testnum} {
# don't check for leaks when no tests were executed
dict set srv "skipleaks" 1
@@ -241,22 +266,24 @@ proc start_server {options {code undefined}} {
# allow an exception to bubble up the call chain but still kill this
# server, because we want to reuse the ports when the tests are re-run
- if {$err eq "exception"} {
- puts [format "Logged warnings (pid %d):" [dict get $srv "pid"]]
- set warnings [warnings_from_file [dict get $srv "stdout"]]
- if {[string length $warnings] > 0} {
- puts "$warnings"
- } else {
- puts "(none)"
+ if {[info exists err]} {
+ if {$err eq "exception"} {
+ puts [format "Logged warnings (pid %d):" [dict get $srv "pid"]]
+ set warnings [warnings_from_file [dict get $srv "stdout"]]
+ if {[string length $warnings] > 0} {
+ puts "$warnings"
+ } else {
+ puts "(none)"
+ }
+ # kill this server without checking for leaks
+ dict set srv "skipleaks" 1
+ kill_server $srv
+ error "exception"
+ } elseif {[string length $err] > 0} {
+ puts "Error executing the suite, aborting..."
+ puts $err
+ exit 1
}
- # kill this server without checking for leaks
- dict set srv "skipleaks" 1
- kill_server $srv
- error "exception"
- } elseif {[string length $err] > 0} {
- puts "Error executing the suite, aborting..."
- puts $err
- exit 1
}
set ::tags [lrange $::tags 0 end-[llength $tags]]
diff --git a/tests/support/test.tcl b/tests/support/test.tcl
index 298e4c77..93f64928 100644
--- a/tests/support/test.tcl
+++ b/tests/support/test.tcl
@@ -36,8 +36,8 @@ proc assert_encoding {enc key} {
# Swapped out values don't have an encoding, so make sure that
# the value is swapped in before checking the encoding.
set dbg [r debug object $key]
- while {[string match "* swapped:*" $dbg]} {
- [r debug swapin $key]
+ while {[string match "* swapped at:*" $dbg]} {
+ r debug swapin $key
set dbg [r debug object $key]
}
assert_match "* encoding:$enc *" $dbg
diff --git a/tests/support/util.tcl b/tests/support/util.tcl
index b9c89aa8..95153111 100644
--- a/tests/support/util.tcl
+++ b/tests/support/util.tcl
@@ -140,12 +140,19 @@ proc findKeyWithType {r type} {
return {}
}
-proc createComplexDataset {r ops} {
+proc createComplexDataset {r ops {opt {}}} {
for {set j 0} {$j < $ops} {incr j} {
set k [randomKey]
set k2 [randomKey]
set f [randomValue]
set v [randomValue]
+
+ if {[lsearch -exact $opt useexpire] != -1} {
+ if {rand() < 0.1} {
+ {*}$r expire [randomKey] [randomInt 2]
+ }
+ }
+
randpath {
set d [expr {rand()}]
} {
diff --git a/tests/test_helper.tcl b/tests/test_helper.tcl
index ef1f9923..ee7fa3e1 100644
--- a/tests/test_helper.tcl
+++ b/tests/test_helper.tcl
@@ -25,7 +25,14 @@ proc execute_tests name {
# are nested, use "srv 0 pid" to get the pid of the inner server. To access
# outer servers, use "srv -1 pid" etcetera.
set ::servers {}
-proc srv {level property} {
+proc srv {args} {
+ set level 0
+ if {[string is integer [lindex $args 0]]} {
+ set level [lindex $args 0]
+ set property [lindex $args 1]
+ } else {
+ set property [lindex $args 0]
+ }
set srv [lindex $::servers end+$level]
dict get $srv $property
}
@@ -88,6 +95,7 @@ proc main {} {
execute_tests "unit/cas"
execute_tests "integration/replication"
execute_tests "integration/aof"
+# execute_tests "integration/redis-cli"
execute_tests "unit/pubsub"
# run tests with VM enabled
diff --git a/tests/unit/basic.tcl b/tests/unit/basic.tcl
index f888cabc..a8f7feb0 100644
--- a/tests/unit/basic.tcl
+++ b/tests/unit/basic.tcl
@@ -148,12 +148,11 @@ start_server {tags {"basic"}} {
r get novar2
} {foobared}
- test {SETNX will overwrite EXPIREing key} {
+ test {SETNX against volatile key} {
r set x 10
r expire x 10000
- r setnx x 20
- r get x
- } {20}
+ list [r setnx x 20] [r get x]
+ } {0 10}
test {EXISTS} {
set res {}
@@ -362,13 +361,6 @@ start_server {tags {"basic"}} {
list [r msetnx x1 xxx y2 yyy] [r get x1] [r get y2]
} {1 xxx yyy}
- test {MSETNX should remove all the volatile keys even on failure} {
- r mset x 1 y 2 z 3
- r expire y 10000
- r expire z 10000
- list [r msetnx x A y B z C] [r mget x y z]
- } {0 {1 {} {}}}
-
test {STRLEN against non existing key} {
r strlen notakey
} {0}
diff --git a/tests/unit/expire.tcl b/tests/unit/expire.tcl
index b80975b6..6f16ed58 100644
--- a/tests/unit/expire.tcl
+++ b/tests/unit/expire.tcl
@@ -1,12 +1,13 @@
start_server {tags {"expire"}} {
- test {EXPIRE - don't set timeouts multiple times} {
+ test {EXPIRE - set timeouts multiple times} {
r set x foobar
set v1 [r expire x 5]
set v2 [r ttl x]
set v3 [r expire x 10]
set v4 [r ttl x]
+ r expire x 4
list $v1 $v2 $v3 $v4
- } {1 5 0 5}
+ } {1 5 1 10}
test {EXPIRE - It should be still possible to read 'x'} {
r get x
@@ -19,13 +20,13 @@ start_server {tags {"expire"}} {
} {{} 0}
}
- test {EXPIRE - Delete on write policy} {
+ test {EXPIRE - write on expire should work} {
r del x
r lpush x foo
r expire x 1000
r lpush x bar
r lrange x 0 -1
- } {bar}
+ } {bar foo}
test {EXPIREAT - Check for EXPIRE alike behavior} {
r del x
@@ -59,4 +60,15 @@ start_server {tags {"expire"}} {
catch {r setex z -10 foo} e
set _ $e
} {*invalid expire*}
+
+ test {PERSIST can undo an EXPIRE} {
+ r set x foo
+ r expire x 50
+ list [r ttl x] [r persist x] [r ttl x] [r get x]
+ } {50 1 -1 foo}
+
+ test {PERSIST returns 0 against non existing or non volatile keys} {
+ r set x foo
+ list [r persist x] [r persist nokeyatall]
+ } {0 0}
}
diff --git a/tests/unit/other.tcl b/tests/unit/other.tcl
index f0497b62..5967c722 100644
--- a/tests/unit/other.tcl
+++ b/tests/unit/other.tcl
@@ -1,4 +1,4 @@
-start_server {} {
+start_server {tags {"other"}} {
test {SAVE - make sure there are all the types as values} {
# Wait for a background saving in progress to terminate
waitForBgsave r
diff --git a/tests/unit/protocol.tcl b/tests/unit/protocol.tcl
index 8717cd9f..5bf42d7f 100644
--- a/tests/unit/protocol.tcl
+++ b/tests/unit/protocol.tcl
@@ -1,4 +1,4 @@
-start_server {} {
+start_server {tags {"protocol"}} {
test {Handle an empty query well} {
set fd [r channel]
puts -nonewline $fd "\r\n"
@@ -27,6 +27,13 @@ start_server {} {
gets $fd
} {*invalid bulk*count*}
+ test {bulk payload is not a number} {
+ set fd [r channel]
+ puts -nonewline $fd "SET x blabla\r\n"
+ flush $fd
+ gets $fd
+ } {*invalid bulk*count*}
+
test {Multi bulk request not followed by bulk args} {
set fd [r channel]
puts -nonewline $fd "*1\r\nfoo\r\n"
diff --git a/tests/unit/sort.tcl b/tests/unit/sort.tcl
index 16a02b3a..dcc471fb 100644
--- a/tests/unit/sort.tcl
+++ b/tests/unit/sort.tcl
@@ -1,5 +1,101 @@
-start_server {tags {"sort"}} {
- test {SORT ALPHA against integer encoded strings} {
+start_server {
+ tags {"sort"}
+ overrides {
+ "list-max-ziplist-value" 16
+ "list-max-ziplist-entries" 32
+ "set-max-intset-entries" 32
+ }
+} {
+ proc create_random_dataset {num cmd} {
+ set tosort {}
+ set result {}
+ array set seenrand {}
+ r del tosort
+ for {set i 0} {$i < $num} {incr i} {
+ # Make sure all the weights are different because
+ # Redis does not use a stable sort but Tcl does.
+ while 1 {
+ randpath {
+ set rint [expr int(rand()*1000000)]
+ } {
+ set rint [expr rand()]
+ }
+ if {![info exists seenrand($rint)]} break
+ }
+ set seenrand($rint) x
+ r $cmd tosort $i
+ r set weight_$i $rint
+ r hset wobj_$i weight $rint
+ lappend tosort [list $i $rint]
+ }
+ set sorted [lsort -index 1 -real $tosort]
+ for {set i 0} {$i < $num} {incr i} {
+ lappend result [lindex $sorted $i 0]
+ }
+ set _ $result
+ }
+
+ foreach {num cmd enc title} {
+ 16 lpush ziplist "Ziplist"
+ 1000 lpush linkedlist "Linked list"
+ 10000 lpush linkedlist "Big Linked list"
+ 16 sadd intset "Intset"
+ 1000 sadd hashtable "Hash table"
+ 10000 sadd hashtable "Big Hash table"
+ } {
+ set result [create_random_dataset $num $cmd]
+ assert_encoding $enc tosort
+
+ test "$title: SORT BY key" {
+ assert_equal $result [r sort tosort {BY weight_*}]
+ }
+
+ test "$title: SORT BY hash field" {
+ assert_equal $result [r sort tosort {BY wobj_*->weight}]
+ }
+ }
+
+ set result [create_random_dataset 16 lpush]
+ test "SORT GET #" {
+ assert_equal [lsort -integer $result] [r sort tosort GET #]
+ }
+
+ test "SORT GET <const>" {
+ r del foo
+ set res [r sort tosort GET foo]
+ assert_equal 16 [llength $res]
+ foreach item $res { assert_equal {} $item }
+ }
+
+ test "SORT GET (key and hash) with sanity check" {
+ set l1 [r sort tosort GET # GET weight_*]
+ set l2 [r sort tosort GET # GET wobj_*->weight]
+ foreach {id1 w1} $l1 {id2 w2} $l2 {
+ assert_equal $id1 $id2
+ assert_equal $w1 [r get weight_$id1]
+ assert_equal $w2 [r get weight_$id1]
+ }
+ }
+
+ test "SORT BY key STORE" {
+ r sort tosort {BY weight_*} store sort-res
+ assert_equal $result [r lrange sort-res 0 -1]
+ assert_equal 16 [r llen sort-res]
+ assert_encoding ziplist sort-res
+ }
+
+ test "SORT BY hash field STORE" {
+ r sort tosort {BY wobj_*->weight} store sort-res
+ assert_equal $result [r lrange sort-res 0 -1]
+ assert_equal 16 [r llen sort-res]
+ assert_encoding ziplist sort-res
+ }
+
+ test "SORT DESC" {
+ assert_equal [lsort -decreasing -integer $result] [r sort tosort {DESC}]
+ }
+
+ test "SORT ALPHA against integer encoded strings" {
r del mylist
r lpush mylist 2
r lpush mylist 1
@@ -8,155 +104,7 @@ start_server {tags {"sort"}} {
r sort mylist alpha
} {1 10 2 3}
- tags {"slow"} {
- set res {}
- test {Create a random list and a random set} {
- set tosort {}
- array set seenrand {}
- for {set i 0} {$i < 10000} {incr i} {
- while 1 {
- # Make sure all the weights are different because
- # Redis does not use a stable sort but Tcl does.
- randpath {
- set rint [expr int(rand()*1000000)]
- } {
- set rint [expr rand()]
- }
- if {![info exists seenrand($rint)]} break
- }
- set seenrand($rint) x
- r lpush tosort $i
- r sadd tosort-set $i
- r set weight_$i $rint
- r hset wobj_$i weight $rint
- lappend tosort [list $i $rint]
- }
- set sorted [lsort -index 1 -real $tosort]
- for {set i 0} {$i < 10000} {incr i} {
- lappend res [lindex $sorted $i 0]
- }
- format {}
- } {}
-
- test {SORT with BY against the newly created list} {
- r sort tosort {BY weight_*}
- } $res
-
- test {SORT with BY (hash field) against the newly created list} {
- r sort tosort {BY wobj_*->weight}
- } $res
-
- test {SORT with GET (key+hash) with sanity check of each element (list)} {
- set err {}
- set l1 [r sort tosort GET # GET weight_*]
- set l2 [r sort tosort GET # GET wobj_*->weight]
- foreach {id1 w1} $l1 {id2 w2} $l2 {
- set realweight [r get weight_$id1]
- if {$id1 != $id2} {
- set err "ID mismatch $id1 != $id2"
- break
- }
- if {$realweight != $w1 || $realweight != $w2} {
- set err "Weights mismatch! w1: $w1 w2: $w2 real: $realweight"
- break
- }
- }
- set _ $err
- } {}
-
- test {SORT with BY, but against the newly created set} {
- r sort tosort-set {BY weight_*}
- } $res
-
- test {SORT with BY (hash field), but against the newly created set} {
- r sort tosort-set {BY wobj_*->weight}
- } $res
-
- test {SORT with BY and STORE against the newly created list} {
- r sort tosort {BY weight_*} store sort-res
- r lrange sort-res 0 -1
- } $res
-
- test {SORT with BY (hash field) and STORE against the newly created list} {
- r sort tosort {BY wobj_*->weight} store sort-res
- r lrange sort-res 0 -1
- } $res
-
- test {SORT direct, numeric, against the newly created list} {
- r sort tosort
- } [lsort -integer $res]
-
- test {SORT decreasing sort} {
- r sort tosort {DESC}
- } [lsort -decreasing -integer $res]
-
- test {SORT speed, sorting 10000 elements list using BY, 100 times} {
- set start [clock clicks -milliseconds]
- for {set i 0} {$i < 100} {incr i} {
- set sorted [r sort tosort {BY weight_* LIMIT 0 10}]
- }
- set elapsed [expr [clock clicks -milliseconds]-$start]
- puts -nonewline "\n Average time to sort: [expr double($elapsed)/100] milliseconds "
- flush stdout
- format {}
- } {}
-
- test {SORT speed, as above but against hash field} {
- set start [clock clicks -milliseconds]
- for {set i 0} {$i < 100} {incr i} {
- set sorted [r sort tosort {BY wobj_*->weight LIMIT 0 10}]
- }
- set elapsed [expr [clock clicks -milliseconds]-$start]
- puts -nonewline "\n Average time to sort: [expr double($elapsed)/100] milliseconds "
- flush stdout
- format {}
- } {}
-
- test {SORT speed, sorting 10000 elements list directly, 100 times} {
- set start [clock clicks -milliseconds]
- for {set i 0} {$i < 100} {incr i} {
- set sorted [r sort tosort {LIMIT 0 10}]
- }
- set elapsed [expr [clock clicks -milliseconds]-$start]
- puts -nonewline "\n Average time to sort: [expr double($elapsed)/100] milliseconds "
- flush stdout
- format {}
- } {}
-
- test {SORT speed, pseudo-sorting 10000 elements list, BY <const>, 100 times} {
- set start [clock clicks -milliseconds]
- for {set i 0} {$i < 100} {incr i} {
- set sorted [r sort tosort {BY nokey LIMIT 0 10}]
- }
- set elapsed [expr [clock clicks -milliseconds]-$start]
- puts -nonewline "\n Average time to sort: [expr double($elapsed)/100] milliseconds "
- flush stdout
- format {}
- } {}
- }
-
- test {SORT regression for issue #19, sorting floats} {
- r flushdb
- foreach x {1.1 5.10 3.10 7.44 2.1 5.75 6.12 0.25 1.15} {
- r lpush mylist $x
- }
- r sort mylist
- } [lsort -real {1.1 5.10 3.10 7.44 2.1 5.75 6.12 0.25 1.15}]
-
- test {SORT with GET #} {
- r del mylist
- r lpush mylist 1
- r lpush mylist 2
- r lpush mylist 3
- r mset weight_1 10 weight_2 5 weight_3 30
- r sort mylist BY weight_* GET #
- } {2 1 3}
-
- test {SORT with constant GET} {
- r sort mylist GET foo
- } {{} {} {}}
-
- test {SORT against sorted sets} {
+ test "SORT sorted set" {
r del zset
r zadd zset 1 a
r zadd zset 5 b
@@ -166,7 +114,7 @@ start_server {tags {"sort"}} {
r sort zset alpha desc
} {e d c b a}
- test {Sorted sets +inf and -inf handling} {
+ test "SORT sorted set: +inf and -inf handling" {
r del zset
r zadd zset -100 a
r zadd zset 200 b
@@ -176,4 +124,58 @@ start_server {tags {"sort"}} {
r zadd zset -inf min
r zrange zset 0 -1
} {min c a b d max}
+
+ test "SORT regression for issue #19, sorting floats" {
+ r flushdb
+ set floats {1.1 5.10 3.10 7.44 2.1 5.75 6.12 0.25 1.15}
+ foreach x $floats {
+ r lpush mylist $x
+ }
+ assert_equal [lsort -real $floats] [r sort mylist]
+ }
+
+ tags {"slow"} {
+ set num 100
+ set res [create_random_dataset $num lpush]
+
+ test "SORT speed, $num element list BY key, 100 times" {
+ set start [clock clicks -milliseconds]
+ for {set i 0} {$i < 100} {incr i} {
+ set sorted [r sort tosort {BY weight_* LIMIT 0 10}]
+ }
+ set elapsed [expr [clock clicks -milliseconds]-$start]
+ puts -nonewline "\n Average time to sort: [expr double($elapsed)/100] milliseconds "
+ flush stdout
+ }
+
+ test "SORT speed, $num element list BY hash field, 100 times" {
+ set start [clock clicks -milliseconds]
+ for {set i 0} {$i < 100} {incr i} {
+ set sorted [r sort tosort {BY wobj_*->weight LIMIT 0 10}]
+ }
+ set elapsed [expr [clock clicks -milliseconds]-$start]
+ puts -nonewline "\n Average time to sort: [expr double($elapsed)/100] milliseconds "
+ flush stdout
+ }
+
+ test "SORT speed, $num element list directly, 100 times" {
+ set start [clock clicks -milliseconds]
+ for {set i 0} {$i < 100} {incr i} {
+ set sorted [r sort tosort {LIMIT 0 10}]
+ }
+ set elapsed [expr [clock clicks -milliseconds]-$start]
+ puts -nonewline "\n Average time to sort: [expr double($elapsed)/100] milliseconds "
+ flush stdout
+ }
+
+ test "SORT speed, $num element list BY <const>, 100 times" {
+ set start [clock clicks -milliseconds]
+ for {set i 0} {$i < 100} {incr i} {
+ set sorted [r sort tosort {BY nokey LIMIT 0 10}]
+ }
+ set elapsed [expr [clock clicks -milliseconds]-$start]
+ puts -nonewline "\n Average time to sort: [expr double($elapsed)/100] milliseconds "
+ flush stdout
+ }
+ }
}
diff --git a/tests/unit/type/hash.tcl b/tests/unit/type/hash.tcl
index ef49a27d..2c0bd534 100644
--- a/tests/unit/type/hash.tcl
+++ b/tests/unit/type/hash.tcl
@@ -15,8 +15,8 @@ start_server {tags {"hash"}} {
} {8}
test {Is the small hash encoded with a zipmap?} {
- r debug object smallhash
- } {*zipmap*}
+ assert_encoding zipmap smallhash
+ }
test {HSET/HLEN - Big hash creation} {
array set bighash {}
@@ -34,8 +34,8 @@ start_server {tags {"hash"}} {
} {1024}
test {Is the big hash encoded with a zipmap?} {
- r debug object bighash
- } {*hashtable*}
+ assert_encoding hashtable bighash
+ }
test {HGET against the small hash} {
set err {}
diff --git a/tests/unit/type/list.tcl b/tests/unit/type/list.tcl
index d3ed90ec..4c131fc3 100644
--- a/tests/unit/type/list.tcl
+++ b/tests/unit/type/list.tcl
@@ -139,6 +139,28 @@ start_server {
assert_equal 0 [r exists blist1]
}
+ test "$pop: with negative timeout" {
+ set rd [redis_deferring_client]
+ $rd $pop blist1 -1
+ assert_error "ERR*is negative*" {$rd read}
+ }
+
+ test "$pop: with non-integer timeout" {
+ set rd [redis_deferring_client]
+ $rd $pop blist1 1.1
+ assert_error "ERR*not an integer*" {$rd read}
+ }
+
+ test "$pop: with zero timeout should block indefinitely" {
+ # To test this, use a timeout of 0 and wait a second.
+ # The blocking pop should still be waiting for a push.
+ set rd [redis_deferring_client]
+ $rd $pop blist1 0
+ after 1000
+ r rpush blist1 foo
+ assert_equal {blist1 foo} [$rd read]
+ }
+
test "$pop: second argument is not a list" {
set rd [redis_deferring_client]
r del blist1 blist2
@@ -172,6 +194,17 @@ start_server {
}
}
+ test {BLPOP inside a transaction} {
+ r del xlist
+ r lpush xlist foo
+ r lpush xlist bar
+ r multi
+ r blpop xlist 0
+ r blpop xlist 0
+ r blpop xlist 0
+ r exec
+ } {{xlist bar} {xlist foo} {}}
+
test {LPUSHX, RPUSHX - generic} {
r del xlist
assert_equal 0 [r lpushx xlist a]
@@ -570,5 +603,76 @@ start_server {
assert_equal 1 [r lrem myotherlist 1 2]
assert_equal 3 [r llen myotherlist]
}
+
+ }
+}
+
+start_server {
+ tags {list ziplist}
+ overrides {
+ "list-max-ziplist-value" 200000
+ "list-max-ziplist-entries" 256
+ }
+} {
+ test {Explicit regression for a list bug} {
+ set mylist {49376042582 {BkG2o\pIC]4YYJa9cJ4GWZalG[4tin;1D2whSkCOW`mX;SFXGyS8sedcff3fQI^tgPCC@^Nu1J6o]meM@Lko]t_jRyotK?tH[\EvWqS]b`o2OCtjg:?nUTwdjpcUm]y:pg5q24q7LlCOwQE^}}
+ r del l
+ r rpush l [lindex $mylist 0]
+ r rpush l [lindex $mylist 1]
+ assert_equal [r lindex l 0] [lindex $mylist 0]
+ assert_equal [r lindex l 1] [lindex $mylist 1]
+ }
+
+ tags {slow} {
+ test {ziplist implementation: value encoding and backlink} {
+ for {set j 0} {$j < 100} {incr j} {
+ r del l
+ set l {}
+ for {set i 0} {$i < 200} {incr i} {
+ randpath {
+ set data [string repeat x [randomInt 100000]]
+ } {
+ set data [randomInt 65536]
+ } {
+ set data [randomInt 4294967296]
+ } {
+ set data [randomInt 18446744073709551616]
+ }
+ lappend l $data
+ r rpush l $data
+ }
+ assert_equal [llength $l] [r llen l]
+ # Traverse backward
+ for {set i 199} {$i >= 0} {incr i -1} {
+ if {[lindex $l $i] ne [r lindex l $i]} {
+ assert_equal [lindex $l $i] [r lindex l $i]
+ }
+ }
+ }
+ }
+
+ test {ziplist implementation: encoding stress testing} {
+ for {set j 0} {$j < 200} {incr j} {
+ r del l
+ set l {}
+ set len [randomInt 400]
+ for {set i 0} {$i < $len} {incr i} {
+ set rv [randomValue]
+ randpath {
+ lappend l $rv
+ r rpush l $rv
+ } {
+ set l [concat [list $rv] $l]
+ r lpush l $rv
+ }
+ }
+ assert_equal [llength $l] [r llen l]
+ for {set i 0} {$i < 200} {incr i} {
+ if {[lindex $l $i] ne [r lindex l $i]} {
+ assert_equal [lindex $l $i] [r lindex l $i]
+ }
+ }
+ }
+ }
}
}
diff --git a/tests/unit/type/set.tcl b/tests/unit/type/set.tcl
index 58ea2b5b..5608a648 100644
--- a/tests/unit/type/set.tcl
+++ b/tests/unit/type/set.tcl
@@ -1,151 +1,334 @@
-start_server {tags {"set"}} {
- test {SADD, SCARD, SISMEMBER, SMEMBERS basics} {
- r sadd myset foo
- r sadd myset bar
- list [r scard myset] [r sismember myset foo] \
- [r sismember myset bar] [r sismember myset bla] \
- [lsort [r smembers myset]]
- } {2 1 1 0 {bar foo}}
+start_server {
+ tags {"set"}
+ overrides {
+ "set-max-intset-entries" 512
+ }
+} {
+ proc create_set {key entries} {
+ r del $key
+ foreach entry $entries { r sadd $key $entry }
+ }
- test {SADD adding the same element multiple times} {
- r sadd myset foo
- r sadd myset foo
- r sadd myset foo
- r scard myset
- } {2}
+ test {SADD, SCARD, SISMEMBER, SMEMBERS basics - regular set} {
+ create_set myset {foo}
+ assert_encoding hashtable myset
+ assert_equal 1 [r sadd myset bar]
+ assert_equal 0 [r sadd myset bar]
+ assert_equal 2 [r scard myset]
+ assert_equal 1 [r sismember myset foo]
+ assert_equal 1 [r sismember myset bar]
+ assert_equal 0 [r sismember myset bla]
+ assert_equal {bar foo} [lsort [r smembers myset]]
+ }
+
+ test {SADD, SCARD, SISMEMBER, SMEMBERS basics - intset} {
+ create_set myset {17}
+ assert_encoding intset myset
+ assert_equal 1 [r sadd myset 16]
+ assert_equal 0 [r sadd myset 16]
+ assert_equal 2 [r scard myset]
+ assert_equal 1 [r sismember myset 16]
+ assert_equal 1 [r sismember myset 17]
+ assert_equal 0 [r sismember myset 18]
+ assert_equal {16 17} [lsort [r smembers myset]]
+ }
test {SADD against non set} {
r lpush mylist foo
- catch {r sadd mylist bar} err
- format $err
- } {ERR*kind*}
+ assert_error ERR*kind* {r sadd mylist bar}
+ }
- test {SREM basics} {
- r sadd myset ciao
- r srem myset foo
- lsort [r smembers myset]
- } {bar ciao}
+ test "SADD a non-integer against an intset" {
+ create_set myset {1 2 3}
+ assert_encoding intset myset
+ assert_equal 1 [r sadd myset a]
+ assert_encoding hashtable myset
+ }
- test {Mass SADD and SINTER with two sets} {
- for {set i 0} {$i < 1000} {incr i} {
- r sadd set1 $i
- r sadd set2 [expr $i+995]
- }
- lsort [r sinter set1 set2]
- } {995 996 997 998 999}
+ test "SADD an integer larger than 64 bits" {
+ create_set myset {213244124402402314402033402}
+ assert_encoding hashtable myset
+ assert_equal 1 [r sismember myset 213244124402402314402033402]
+ }
- test {SUNION with two sets} {
- lsort [r sunion set1 set2]
- } [lsort -uniq "[r smembers set1] [r smembers set2]"]
+ test "SADD overflows the maximum allowed integers in an intset" {
+ r del myset
+ for {set i 0} {$i < 512} {incr i} { r sadd myset $i }
+ assert_encoding intset myset
+ assert_equal 1 [r sadd myset 512]
+ assert_encoding hashtable myset
+ }
- test {SINTERSTORE with two sets} {
- r sinterstore setres set1 set2
- lsort [r smembers setres]
- } {995 996 997 998 999}
+ test "Set encoding after DEBUG RELOAD" {
+ r del myintset myhashset mylargeintset
+ for {set i 0} {$i < 100} {incr i} { r sadd myintset $i }
+ for {set i 0} {$i < 1280} {incr i} { r sadd mylargeintset $i }
+ for {set i 0} {$i < 256} {incr i} { r sadd myhashset [format "i%03d" $i] }
+ assert_encoding intset myintset
+ assert_encoding hashtable mylargeintset
+ assert_encoding hashtable myhashset
- test {SINTERSTORE with two sets, after a DEBUG RELOAD} {
r debug reload
- r sinterstore setres set1 set2
- lsort [r smembers setres]
- } {995 996 997 998 999}
+ assert_encoding intset myintset
+ assert_encoding hashtable mylargeintset
+ assert_encoding hashtable myhashset
+ }
- test {SUNIONSTORE with two sets} {
- r sunionstore setres set1 set2
- lsort [r smembers setres]
- } [lsort -uniq "[r smembers set1] [r smembers set2]"]
+ test {SREM basics - regular set} {
+ create_set myset {foo bar ciao}
+ assert_encoding hashtable myset
+ assert_equal 0 [r srem myset qux]
+ assert_equal 1 [r srem myset foo]
+ assert_equal {bar ciao} [lsort [r smembers myset]]
+ }
- test {SUNIONSTORE against non existing keys} {
- r set setres xxx
- list [r sunionstore setres foo111 bar222] [r exists xxx]
- } {0 0}
+ test {SREM basics - intset} {
+ create_set myset {3 4 5}
+ assert_encoding intset myset
+ assert_equal 0 [r srem myset 6]
+ assert_equal 1 [r srem myset 4]
+ assert_equal {3 5} [lsort [r smembers myset]]
+ }
- test {SINTER against three sets} {
- r sadd set3 999
- r sadd set3 995
- r sadd set3 1000
- r sadd set3 2000
- lsort [r sinter set1 set2 set3]
- } {995 999}
-
- test {SINTERSTORE with three sets} {
- r sinterstore setres set1 set2 set3
- lsort [r smembers setres]
- } {995 999}
-
- test {SUNION with non existing keys} {
- lsort [r sunion nokey1 set1 set2 nokey2]
- } [lsort -uniq "[r smembers set1] [r smembers set2]"]
-
- test {SDIFF with two sets} {
- for {set i 5} {$i < 1000} {incr i} {
+ foreach {type} {hashtable intset} {
+ for {set i 1} {$i <= 5} {incr i} {
+ r del [format "set%d" $i]
+ }
+ for {set i 0} {$i < 200} {incr i} {
+ r sadd set1 $i
+ r sadd set2 [expr $i+195]
+ }
+ foreach i {199 195 1000 2000} {
+ r sadd set3 $i
+ }
+ for {set i 5} {$i < 200} {incr i} {
r sadd set4 $i
}
- lsort [r sdiff set1 set4]
- } {0 1 2 3 4}
-
- test {SDIFF with three sets} {
r sadd set5 0
- lsort [r sdiff set1 set4 set5]
- } {1 2 3 4}
- test {SDIFFSTORE with three sets} {
- r sdiffstore sres set1 set4 set5
- lsort [r smembers sres]
- } {1 2 3 4}
-
- test {SPOP basics} {
- r del myset
- r sadd myset 1
- r sadd myset 2
- r sadd myset 3
- list [lsort [list [r spop myset] [r spop myset] [r spop myset]]] [r scard myset]
- } {{1 2 3} 0}
-
- test {SRANDMEMBER} {
- r del myset
- r sadd myset a
- r sadd myset b
- r sadd myset c
- unset -nocomplain myset
- array set myset {}
- for {set i 0} {$i < 100} {incr i} {
- set myset([r srandmember myset]) 1
+ # To make sure the sets are encoded as the type we are testing -- also
+ # when the VM is enabled and the values may be swapped in and out
+ # while the tests are running -- an extra element is added to every
+ # set that determines its encoding.
+ set large 200
+ if {$type eq "hashtable"} {
+ set large foo
}
- lsort [array names myset]
- } {a b c}
-
- test {SMOVE basics} {
- r sadd myset1 a
- r sadd myset1 b
- r sadd myset1 c
- r sadd myset2 x
- r sadd myset2 y
- r sadd myset2 z
- r smove myset1 myset2 a
- list [lsort [r smembers myset2]] [lsort [r smembers myset1]]
- } {{a x y z} {b c}}
- test {SMOVE non existing key} {
- list [r smove myset1 myset2 foo] [lsort [r smembers myset2]] [lsort [r smembers myset1]]
- } {0 {a x y z} {b c}}
+ for {set i 1} {$i <= 5} {incr i} {
+ r sadd [format "set%d" $i] $large
+ }
- test {SMOVE non existing src set} {
- list [r smove noset myset2 foo] [lsort [r smembers myset2]]
- } {0 {a x y z}}
+ test "Generated sets must be encoded as $type" {
+ for {set i 1} {$i <= 5} {incr i} {
+ assert_encoding $type [format "set%d" $i]
+ }
+ }
- test {SMOVE non existing dst set} {
- list [r smove myset2 myset3 y] [lsort [r smembers myset2]] [lsort [r smembers myset3]]
- } {1 {a x z} y}
+ test "SINTER with two sets - $type" {
+ assert_equal [list 195 196 197 198 199 $large] [lsort [r sinter set1 set2]]
+ }
- test {SMOVE wrong src key type} {
+ test "SINTERSTORE with two sets - $type" {
+ r sinterstore setres set1 set2
+ assert_encoding $type setres
+ assert_equal [list 195 196 197 198 199 $large] [lsort [r smembers setres]]
+ }
+
+ test "SINTERSTORE with two sets, after a DEBUG RELOAD - $type" {
+ r debug reload
+ r sinterstore setres set1 set2
+ assert_encoding $type setres
+ assert_equal [list 195 196 197 198 199 $large] [lsort [r smembers setres]]
+ }
+
+ test "SUNION with two sets - $type" {
+ set expected [lsort -uniq "[r smembers set1] [r smembers set2]"]
+ assert_equal $expected [lsort [r sunion set1 set2]]
+ }
+
+ test "SUNIONSTORE with two sets - $type" {
+ r sunionstore setres set1 set2
+ assert_encoding $type setres
+ set expected [lsort -uniq "[r smembers set1] [r smembers set2]"]
+ assert_equal $expected [lsort [r smembers setres]]
+ }
+
+ test "SINTER against three sets - $type" {
+ assert_equal [list 195 199 $large] [lsort [r sinter set1 set2 set3]]
+ }
+
+ test "SINTERSTORE with three sets - $type" {
+ r sinterstore setres set1 set2 set3
+ assert_equal [list 195 199 $large] [lsort [r smembers setres]]
+ }
+
+ test "SUNION with non existing keys - $type" {
+ set expected [lsort -uniq "[r smembers set1] [r smembers set2]"]
+ assert_equal $expected [lsort [r sunion nokey1 set1 set2 nokey2]]
+ }
+
+ test "SDIFF with two sets - $type" {
+ assert_equal {0 1 2 3 4} [lsort [r sdiff set1 set4]]
+ }
+
+ test "SDIFF with three sets - $type" {
+ assert_equal {1 2 3 4} [lsort [r sdiff set1 set4 set5]]
+ }
+
+ test "SDIFFSTORE with three sets - $type" {
+ r sdiffstore setres set1 set4 set5
+ # The type is determined by type of the first key to diff against.
+ # See the implementation for more information.
+ assert_encoding $type setres
+ assert_equal {1 2 3 4} [lsort [r smembers setres]]
+ }
+ }
+
+ test "SINTER against non-set should throw error" {
+ r set key1 x
+ assert_error "ERR*wrong kind*" {r sinter key1 noset}
+ }
+
+ test "SUNION against non-set should throw error" {
+ r set key1 x
+ assert_error "ERR*wrong kind*" {r sunion key1 noset}
+ }
+
+ test "SINTERSTORE against non existing keys should delete dstkey" {
+ r set setres xxx
+ assert_equal 0 [r sinterstore setres foo111 bar222]
+ assert_equal 0 [r exists setres]
+ }
+
+ test "SUNIONSTORE against non existing keys should delete dstkey" {
+ r set setres xxx
+ assert_equal 0 [r sunionstore setres foo111 bar222]
+ assert_equal 0 [r exists setres]
+ }
+
+ foreach {type contents} {hashtable {a b c} intset {1 2 3}} {
+ test "SPOP basics - $type" {
+ create_set myset $contents
+ assert_encoding $type myset
+ assert_equal $contents [lsort [list [r spop myset] [r spop myset] [r spop myset]]]
+ assert_equal 0 [r scard myset]
+ }
+
+ test "SRANDMEMBER - $type" {
+ create_set myset $contents
+ unset -nocomplain myset
+ array set myset {}
+ for {set i 0} {$i < 100} {incr i} {
+ set myset([r srandmember myset]) 1
+ }
+ assert_equal $contents [lsort [array names myset]]
+ }
+ }
+
+ proc setup_move {} {
+ r del myset3 myset4
+ create_set myset1 {1 a b}
+ create_set myset2 {2 3 4}
+ assert_encoding hashtable myset1
+ assert_encoding intset myset2
+ }
+
+ test "SMOVE basics - from regular set to intset" {
+ # moving a non-integer element to an intset should convert the encoding
+ setup_move
+ assert_equal 1 [r smove myset1 myset2 a]
+ assert_equal {1 b} [lsort [r smembers myset1]]
+ assert_equal {2 3 4 a} [lsort [r smembers myset2]]
+ assert_encoding hashtable myset2
+
+ # moving an integer element should not convert the encoding
+ setup_move
+ assert_equal 1 [r smove myset1 myset2 1]
+ assert_equal {a b} [lsort [r smembers myset1]]
+ assert_equal {1 2 3 4} [lsort [r smembers myset2]]
+ assert_encoding intset myset2
+ }
+
+ test "SMOVE basics - from intset to regular set" {
+ setup_move
+ assert_equal 1 [r smove myset2 myset1 2]
+ assert_equal {1 2 a b} [lsort [r smembers myset1]]
+ assert_equal {3 4} [lsort [r smembers myset2]]
+ }
+
+ test "SMOVE non existing key" {
+ setup_move
+ assert_equal 0 [r smove myset1 myset2 foo]
+ assert_equal {1 a b} [lsort [r smembers myset1]]
+ assert_equal {2 3 4} [lsort [r smembers myset2]]
+ }
+
+ test "SMOVE non existing src set" {
+ setup_move
+ assert_equal 0 [r smove noset myset2 foo]
+ assert_equal {2 3 4} [lsort [r smembers myset2]]
+ }
+
+ test "SMOVE from regular set to non existing destination set" {
+ setup_move
+ assert_equal 1 [r smove myset1 myset3 a]
+ assert_equal {1 b} [lsort [r smembers myset1]]
+ assert_equal {a} [lsort [r smembers myset3]]
+ assert_encoding hashtable myset3
+ }
+
+ test "SMOVE from intset to non existing destination set" {
+ setup_move
+ assert_equal 1 [r smove myset2 myset3 2]
+ assert_equal {3 4} [lsort [r smembers myset2]]
+ assert_equal {2} [lsort [r smembers myset3]]
+ assert_encoding intset myset3
+ }
+
+ test "SMOVE wrong src key type" {
r set x 10
- catch {r smove x myset2 foo} err
- format $err
- } {ERR*}
+ assert_error "ERR*wrong kind*" {r smove x myset2 foo}
+ }
- test {SMOVE wrong dst key type} {
+ test "SMOVE wrong dst key type" {
r set x 10
- catch {r smove myset2 x foo} err
- format $err
- } {ERR*}
+ assert_error "ERR*wrong kind*" {r smove myset2 x foo}
+ }
+
+ tags {slow} {
+ test {intsets implementation stress testing} {
+ for {set j 0} {$j < 20} {incr j} {
+ unset -nocomplain s
+ array set s {}
+ r del s
+ set len [randomInt 1024]
+ for {set i 0} {$i < $len} {incr i} {
+ randpath {
+ set data [randomInt 65536]
+ } {
+ set data [randomInt 4294967296]
+ } {
+ set data [randomInt 18446744073709551616]
+ }
+ set s($data) {}
+ r sadd s $data
+ }
+ assert_equal [lsort [r smembers s]] [lsort [array names s]]
+ set len [array size s]
+ for {set i 0} {$i < $len} {incr i} {
+ set e [r spop s]
+ if {![info exists s($e)]} {
+ puts "Can't find '$e' on local array"
+ puts "Local array: [lsort [array names s]]"
+ puts "Remote array: [lsort [r smembers s]]"
+ error "exception"
+ }
+ array unset s $e
+ }
+ assert_equal [r scard s] 0
+ assert_equal [array size s] 0
+ }
+ }
+ }
}
diff --git a/utils/redis-copy.rb b/utils/redis-copy.rb
index af214b79..d892e377 100644
--- a/utils/redis-copy.rb
+++ b/utils/redis-copy.rb
@@ -1,12 +1,10 @@
-# redis-sha1.rb - Copyright (C) 2009 Salvatore Sanfilippo
+# redis-copy.rb - Copyright (C) 2009-2010 Salvatore Sanfilippo
# BSD license, See the COPYING file for more information.
#
-# Performs the SHA1 sum of the whole datset.
-# This is useful to spot bugs in persistence related code and to make sure
-# Slaves and Masters are in SYNC.
+# Copy the whole dataset from one Redis instance to another one
#
-# If you hack this code make sure to sort keys and set elements as this are
-# unsorted elements. Otherwise the sum may differ with equal dataset.
+# WARNING: currently hashes and sorted sets are not supported! This
+# program should be updated.
require 'rubygems'
require 'redis'
diff --git a/utils/redis_init_script b/utils/redis_init_script
index 35b906fc..b1c56002 100755
--- a/utils/redis_init_script
+++ b/utils/redis_init_script
@@ -21,15 +21,14 @@ case "$1" in
then
echo -n "$PIDFILE does not exist, process is not running\n"
else
+ PID=$(cat $PIDFILE)
echo -n "Stopping ...\n"
- echo -n "Sending SHUTDOWN\r\n" | nc localhost $REDISPORT &
- PID=$(cat $PIDFILE)
+ echo -n "SHUTDOWN\r\n" | nc localhost $REDISPORT &
while [ -x /proc/${PID} ]
do
echo "Waiting for Redis to shutdown ..."
sleep 1
done
- rm $PIDFILE
echo "Redis stopped"
fi
;;