This is what happened:
1. Instance starts, is a slave in the cluster configuration, but
actually server.masterhost is not set, so technically the instance
is acting like a master.
2. loadDataFromDisk() calls replicationCacheMasterUsingMyself() even if
the instance is a master, in the case it is logically a slave and the
cluster is enabled. So now we have a cached master even if the instance
is practically configured as a master (from the POV of
server.masterhost value and so forth).
3. clusterCron() sees that the instance requires to replicate from its
master, because logically it is a slave, so it calls
replicationSetMaster() that will in turn call
replicationCacheMasterUsingMyself(): before this commit, this call would
overwrite the old cached master, creating a memory leak.
cluster.c - stack buffer memory alignment
The pointer 'buf' is cast to a more strictly aligned pointer type
evict.c - lazyfree_lazy_eviction, lazyfree_lazy_eviction always called
defrag.c - bug in dead code
server.c - casting was missing parenthesis
rax.c - indentation / newline suggested an 'else if' was intended
since the slowlog and other means that can help you detect the bad script
are only exposed after the script is done. it might be a good idea to at least
print the script name (sha) to the log when it timeouts.
The correct way to access the module about a given IO context is to
deference io->type->module, since io->ctx is only populated if the user
requests an explicit context from an IO object.
We don't want that the API could be used directly in an unsafe way,
without checking if there is an active child. Now the safety checks are
moved directly in the function performing the operations.
In theory currently there is only one active child, but the API may
change or for bugs in the implementation we may have several (it was
like that for years because of a bug). Better to wait for a specific
pid and avoid consuing other pending children information.
We can't expect SIGUSR1 to have any specific value range, so let's
define an exit code that we can handle in a special way.
This also fixes an #include <wait.h> that is not standard.