Demystifying used_memory and maxmemory in Redis

Overview

In both the Redis 2.x and 3.x series, running info Memory prints information about Redis's memory usage. In the 2.x series the output looks like this:

127.0.0.1:6379> info Memory
used_memory:279440336
used_memory_human:266.50M
used_memory_rss:295079936
used_memory_peak:298650696
used_memory_peak_human:284.82M
used_memory_lua:36864
mem_fragmentation_ratio:1.06
mem_allocator:jemalloc-3.6.0

In the 3.x series the output looks like this:
used_memory:22259232
used_memory_human:21.23M
used_memory_rss:58331136
used_memory_rss_human:55.63M
used_memory_peak:98079600
used_memory_peak_human:93.54M
total_system_memory:7847321600
total_system_memory_human:7.31G
used_memory_lua:37888
used_memory_lua_human:37.00K
maxmemory:6000000000
maxmemory_human:5.59G
maxmemory_policy:noeviction
mem_fragmentation_ratio:2.62
mem_allocator:jemalloc-4.0.3

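Comparing the two, the 3.x series mainly adds the total system memory, maxmemory, and the maxmemory eviction policy. In the 2.x series, the maxmemory settings have to be read with CONFIG GET maxmemory*. So how are maxmemory and used_memory related, and why can used_memory exceed maxmemory? To answer this, we start from the source code and walk through the relationship between the two in detail.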

How memory is counted

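Redis does not keep a size counter for each object. Instead, it takes advantage of its single-process model and keeps one process-wide counter, used_memory, of the memory it has allocated; this counter is what gets reported. The relevant code is: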

size_t zmalloc_used_memory(void) {
    size_t um;
    if (zmalloc_thread_safe) {
#if defined(__ATOMIC_RELAXED) || defined(HAVE_ATOMIC)
        um = update_zmalloc_stat_add(0);
#else
        pthread_mutex_lock(&used_memory_mutex);
        um = used_memory;
        pthread_mutex_unlock(&used_memory_mutex);
#endif
    }
    else {
        um = used_memory;
    }
    return um;
}
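
To make the idea of a single process-wide counter concrete, here is a minimal sketch of an allocator wrapper that keeps such a counter. The names counted_malloc, counted_free, and counted_used are hypothetical and are not Redis functions; the sketch is not thread-safe and only illustrates the bookkeeping, so Redis's own zmalloc differs in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical illustration, not Redis code: one global byte counter
 * that plays the role of used_memory. */
static size_t counted_bytes = 0;

/* Header stored in front of every block. The union pads the header so
 * the pointer handed to the caller keeps maximum alignment. */
typedef union {
    size_t size;
    max_align_t align;
} block_header;

/* Allocate `size` bytes and add them to the counter. */
void *counted_malloc(size_t size) {
    block_header *h = malloc(sizeof(block_header) + size);
    if (h == NULL) return NULL;
    h->size = size;
    counted_bytes += size;
    return h + 1;               /* caller sees the memory after the header */
}

/* Free a block from counted_malloc and subtract its size again. */
void counted_free(void *ptr) {
    if (ptr == NULL) return;
    block_header *h = (block_header *)ptr - 1;
    assert(counted_bytes >= h->size);
    counted_bytes -= h->size;
    free(h);
}

/* Read the counter, analogous in spirit to zmalloc_used_memory(). */
size_t counted_used(void) {
    return counted_bytes;
}
```

Because every allocation passes through the wrapper, reading the total is a single variable read, with no need to walk the objects.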

maxmemory is set in the configuration file or with the CONFIG command; either way it ends up stored in the redisServer struct:

/* Limits */
   unsigned int maxclients;            /* Max number of simultaneous clients */
   unsigned long long maxmemory;       /* Max number of memory bytes to use */
   int maxmemory_policy;               /* Policy for key eviction */
   int maxmemory_samples;              /* Precision of random sampling */

When the maxmemory check is triggered

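The maxmemory check lives in processCommand, the function that handles each incoming command (not connection setup), so Redis performs the check every time it processes a command. The check itself is implemented by freeMemoryIfNeeded, and that function contains something interesting: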

int freeMemoryIfNeeded(void) {
    size_t mem_used, mem_tofree, mem_freed;
    int slaves = listLength(server.slaves);
    mstime_t latency, eviction_latency;
    /* Remove the size of slaves output buffers and AOF buffer from the
     * count of used memory. */
    mem_used = zmalloc_used_memory();
    if (slaves) {
        listIter li;
        listNode *ln;
        listRewind(server.slaves,&li);
        while((ln = listNext(&li))) {
            client *slave = listNodeValue(ln);
            unsigned long obuf_bytes = getClientOutputBufferMemoryUsage(slave);
            if (obuf_bytes > mem_used)
                mem_used = 0;
            else
                mem_used -= obuf_bytes;
        }
    }
    if (server.aof_state != AOF_OFF) {
        mem_used -= sdslen(server.aof_buf);
        mem_used -= aofRewriteBufferSize();
    }

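This code reveals the author's actual design intent: maxmemory does not limit the total memory Redis may use. It limits the size of the stored data, which is why some buffers are subtracted before the comparison. The implementation has some fairly serious problems, though, which we discuss in detail below.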
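
The subtraction in the excerpt above can be sketched as a small standalone function. The names counted_against_maxmemory and over_limit are hypothetical, and the buffer sizes are passed in as plain numbers instead of being read from server state. Unlike the excerpt, this sketch also clamps the AOF subtractions at zero, and it assumes that a maxmemory of 0 means "no limit".

```c
#include <assert.h>
#include <stddef.h>

/* Subtract `amount` from `value` without wrapping below zero. */
static size_t sub_floor0(size_t value, size_t amount) {
    return amount > value ? 0 : value - amount;
}

/* Hypothetical helper: the amount of memory that is compared with
 * maxmemory after the slave output buffers and AOF buffers are removed. */
size_t counted_against_maxmemory(size_t used,
                                 const size_t *slave_obuf, size_t nslaves,
                                 size_t aof_buf, size_t aof_rewrite_buf) {
    assert(slave_obuf != NULL || nslaves == 0);
    for (size_t i = 0; i < nslaves; i++)
        used = sub_floor0(used, slave_obuf[i]);
    used = sub_floor0(used, aof_buf);
    used = sub_floor0(used, aof_rewrite_buf);
    return used;
}

/* Returns 1 when the counted amount exceeds the limit.
 * Assumption: maxmemory == 0 disables the limit. */
int over_limit(size_t counted, size_t maxmemory) {
    return maxmemory != 0 && counted > maxmemory;
}
```

With used = 1000, two slave buffers of 100 and 50, and AOF buffers of 30 and 20, the counted amount is 800. Against a maxmemory of 900 nothing is over the limit, even though the raw figure of 1000 is what info reports as used_memory.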

What confuses users

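The confusion comes from two directions. In the 2.x series, many users assumed that maxmemory was the maximum amount of memory Redis could use, and an unreasonable maxmemory setting led to a string of failures such as OOM and failed slave synchronization.
In the 3.x series, the info command exposes the problem directly: why is used_memory larger than maxmemory, how are the two related, what does maxmemory actually mean, and how large should it be? The author gives no detailed explanation and there is no good documentation, which leaves users with many questions.
In the implementation of the info command, used_memory is taken directly from zmalloc_used_memory(), as shown below. Going by this code, it is to be expected that used_memory can exceed maxmemory, because none of the buffers are subtracted here. The two code paths are also inconsistent in design: for example, the Lua memory reported by info is not subtracted either.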

/* Memory */
if (allsections || defsections || !strcasecmp(section,"memory")) {
    char hmem[64];
    char peak_hmem[64];
    char total_system_hmem[64];
    char used_memory_lua_hmem[64];
    char used_memory_rss_hmem[64];
    char maxmemory_hmem[64];
    size_t zmalloc_used = zmalloc_used_memory();
    size_t total_system_mem = server.system_memory_size;
    const char *evict_policy = evictPolicyToString();
    long long memory_lua = (long long)lua_gc(server.lua,LUA_GCCOUNT,0)*1024;
    /* Peak memory is updated from time to time by serverCron() so it
     * may happen that the instantaneous value is slightly bigger than
     * the peak value. This may confuse users, so we update the peak
     * if found smaller than the current memory usage. */
    if (zmalloc_used > server.stat_peak_memory)
        server.stat_peak_memory = zmalloc_used;
    bytesToHuman(hmem,zmalloc_used);
    bytesToHuman(peak_hmem,server.stat_peak_memory);
    bytesToHuman(total_system_hmem,total_system_mem);
    bytesToHuman(used_memory_lua_hmem,memory_lua);
    bytesToHuman(used_memory_rss_hmem,server.resident_set_size);
    bytesToHuman(maxmemory_hmem,server.maxmemory);

What else is missing

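Is mem_used - obuf_bytes - aofRewriteBufferSize() equal to the maximum amount of data that can be stored? No. Besides the slave output buffers and the AOF rewrite buffer, Redis has other buffers. The configuration includes a client-output-buffer-limit parameter: whenever a client requests data, the reply is first stored in that client's output buffer and is removed once it has been sent. The parameter covers three classes of client:

normal: the buffer holding replies for regular clients
slave: the buffer used to synchronize data to slave nodes
pubsub: the buffer produced by pub/sub
For each class you can set a hard limit, a soft limit, and a timeout. The implementation above subtracts the slave output buffers from mem_used but not the buffers used by normal and pubsub clients. This is the first omission.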
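
The hard limit, soft limit, and timeout triple can be read as follows: a client is dropped as soon as its buffer reaches the hard limit, or once it has stayed at or above the soft limit for longer than the timeout. The sketch below illustrates that reading only. The types obuf_limit and client_state and the function should_close are hypothetical, a limit of 0 is assumed to mean "disabled", and `now` is assumed to be a non-zero timestamp.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical limit triple for one client class. */
typedef struct {
    size_t hard_limit_bytes;    /* 0 disables the hard limit */
    size_t soft_limit_bytes;    /* 0 disables the soft limit */
    time_t soft_limit_seconds;  /* how long the soft limit may be exceeded */
} obuf_limit;

/* Hypothetical per-client state. */
typedef struct {
    size_t obuf_bytes;          /* current output buffer usage */
    time_t soft_since;          /* when the soft limit was first hit, 0 if not */
} client_state;

/* Returns 1 if the client should be disconnected at time `now`. */
int should_close(client_state *c, const obuf_limit *lim, time_t now) {
    assert(c != NULL && lim != NULL);
    if (lim->hard_limit_bytes && c->obuf_bytes >= lim->hard_limit_bytes)
        return 1;                           /* hard limit: close at once */
    if (lim->soft_limit_bytes && c->obuf_bytes >= lim->soft_limit_bytes) {
        if (c->soft_since == 0) {
            c->soft_since = now;            /* start the soft-limit clock */
            return 0;
        }
        return (now - c->soft_since) > lim->soft_limit_seconds;
    }
    c->soft_since = 0;                      /* back under the soft limit */
    return 0;
}
```

Until a client is actually dropped, its buffered replies stay in memory, which is why normal and pubsub buffers matter for the accounting discussed here.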
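Before Redis 2.8 there was no incremental replication: if the master-slave link was interrupted, only a full resync was possible. Version 2.8 added partial resynchronization and introduced a new parameter, repl-backlog-size, which sets the size of the replication backlog. The backlog is a ring buffer; there is only one per master process and all slaves share it. During replication, commands are sent to the slaves and are also written to the backlog. When a slave disconnects and reconnects, PSYNC sends it the relevant backlog contents, which gives incremental replication. Because the backlog is a ring buffer, it overwrites the oldest data once it is full; if the data a slave needs has already been overwritten, that slave can only do a full resync. The parameter is therefore critical for incremental replication, and the backlog also occupies memory on the master. This is the second omission.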
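
The overwrite behavior of the backlog can be illustrated with a tiny ring buffer. This is a simplified sketch, not the Redis implementation: the backlog type, BACKLOG_SIZE, and the backlog_* functions are hypothetical names, and the buffer is only 8 bytes so the wrap-around is easy to follow.

```c
#include <assert.h>
#include <stddef.h>

#define BACKLOG_SIZE 8   /* deliberately tiny; stands in for repl-backlog-size */

/* Hypothetical ring-buffer backlog shared by all slaves. */
typedef struct {
    char buf[BACKLOG_SIZE];
    size_t idx;                     /* next write position in buf */
    size_t histlen;                 /* number of valid bytes kept */
    unsigned long long end_offset;  /* total bytes ever written */
} backlog;

void backlog_init(backlog *b) {
    assert(b != NULL);
    b->idx = 0;
    b->histlen = 0;
    b->end_offset = 0;
}

/* Append data; once the buffer is full, the oldest bytes are overwritten. */
void backlog_feed(backlog *b, const char *data, size_t len) {
    assert(b != NULL && (data != NULL || len == 0));
    for (size_t i = 0; i < len; i++) {
        b->buf[b->idx] = data[i];
        b->idx = (b->idx + 1) % BACKLOG_SIZE;
        if (b->histlen < BACKLOG_SIZE) b->histlen++;
    }
    b->end_offset += len;
}

/* Returns 1 if a slave that already holds every byte before `offset`
 * can resume from the backlog, 0 if it needs a full resync. */
int backlog_can_resume(const backlog *b, unsigned long long offset) {
    assert(b != NULL);
    unsigned long long first = b->end_offset - b->histlen;
    return offset >= first && offset <= b->end_offset;
}
```

After 12 bytes have been fed into the 8-byte buffer, only offsets 4 through 12 remain available, so a slave asking to resume from offset 2 would have to fall back to a full resync.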
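The third omission is the memory consumed by copy-on-write (COW) during an RDB save; the code is below. Note that bgsave and the AOF rewrite run in a forked child process, not in a background thread, so the memory pages copied during the save should be counted as memory used by Redis.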

if (retval == C_OK) {
    size_t private_dirty = zmalloc_get_private_dirty();
    if (private_dirty) {
        serverLog(LL_NOTICE,
            "RDB: %zu MB of memory used by copy-on-write",
            private_dirty/(1024*1024));
    }
    ……
}

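Of course, these three points are not the full list of omissions; there are also monitors, Lua, and so on. Because Redis derives mem_used by subtraction, the result is never going to be especially precise.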

How to improve it

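If we follow the author's intent, the maxmemory check should at least also subtract the memory used by the three items above, and the memory section printed by info should gain a data_used_memory field to avoid confusing users.
If instead maxmemory is meant to cap the total memory used by the Redis process, freeMemoryIfNeeded should skip the subtraction and compare against maxmemory directly, which would enforce an upper bound on memory use.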

Summary

In practice, keep the following points in mind:

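maxmemory is not the upper bound on the memory Redis uses.
The real upper bound should account for the stored data plus the various buffers, QPS, the number of slave nodes, and so on.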