http://virtualthreads.blogspot.com/2006/02/understanding-memory-usage-on-linux.html
February 04, 2006 by Devin
This entry is for those people who have ever wondered, "Why the hell is a simple KDE text editor taking up 25 megabytes of memory?" Many people are led to believe that Linux applications, especially KDE or Gnome programs, are "bloated", based solely upon what tools like ps report. While this may be true of some programs, it is not true in general: many programs are much more memory efficient than they seem.
The ps tool can output various pieces of information about a process, such as its process id, current running state, and resource utilization. Two of the possible outputs are VSZ and RSS, which stand for "virtual size" and "resident set size", and which geeks around the world commonly use to gauge how much memory processes are taking up.
For example, here is the output of ps aux for KEdit on my computer:
USER     PID  %CPU %MEM   VSZ   RSS TTY STAT START  TIME COMMAND
dbunker  3468  0.0  2.7 25400 14452 ?   S    20:19  0:00 kdeinit: kedit
According to ps, KEdit has a virtual size of about 25 megabytes and a resident size of about 14 megabytes (both numbers above are reported in kilobytes). Most people seem to pick one number or the other, more or less at random, and accept it as the real memory usage of the process. I'm not going to explain the difference between VSZ and RSS right now, but suffice it to say that this is the wrong approach: neither number is an accurate picture of the memory cost of running KEdit.
Depending on how you look at it, ps is not reporting the real memory usage of processes. What it is really doing is showing how much real memory each process would take up if it were the only process running. Of course, a typical Linux machine has several dozen processes running at any given time, which means that the VSZ and RSS numbers reported by ps are almost definitely "wrong". In order to understand why, it is necessary to learn how Linux handles shared libraries in programs.
Most major programs on Linux use shared libraries to facilitate certain functionality. For example, a KDE text editing program will use several KDE shared libraries (to allow for interaction with other KDE components), several X libraries (to allow it to display images and support copying and pasting), and several general system libraries (to allow it to perform basic operations). Many of these shared libraries, especially commonly used ones like libc, are used by many of the programs running on a Linux system. Due to this sharing, Linux is able to use a great trick: it will load a single copy of the shared libraries into memory and use that one copy for every program that references it.
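You can see exactly which shared libraries a given binary pulls in with ldd. A minimal sketch, using /bin/sh as a stand-in for any dynamically linked program (the exact paths and versions shown will vary by distribution):

```shell
# List the shared libraries a dynamically linked binary depends on.
# /bin/sh is just a convenient example; any dynamic binary works.
ldd /bin/sh

# libc shows up for nearly every binary on the system; the kernel keeps
# one copy of its code pages in memory, shared by all of them.
ldd /bin/sh | grep libc
```

Run ldd against a few unrelated programs and note how often the same libraries recur; every recurrence is memory the system only pays for once.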
For better or worse, many tools don't care very much about this very common trick; they simply report how much memory a process uses, regardless of whether that memory is shared with other processes as well. Two programs could therefore use a large shared library and yet have its size count towards both of their memory usage totals; the library is being double-counted, which can be very misleading if you don't know what is going on.
Unfortunately, a perfect representation of process memory usage isn't easy to obtain. Not only do you need to understand how the system really works, but you need to decide how you want to deal with some hard questions. Should a shared library that is only needed for one process be counted in that process's memory usage? If a shared library is used by multiple processes, should its memory usage be evenly distributed among the different processes, or just ignored? There isn't a hard and fast rule here; you might have different answers depending on the situation you're facing. It's easy to see why ps doesn't try harder to report "correct" memory usage totals, given the ambiguity.
Enough talk; let's see what the situation is with that "huge" KEdit process. To see what KEdit's memory looks like, we'll use the pmap program (with the -d flag):
Address   Kbytes Mode  Offset           Device    Mapping
08048000      40 r-x-- 0000000000000000 0fe:00000 kdeinit
08052000       4 rw--- 0000000000009000 0fe:00000 kdeinit
08053000    1164 rw--- 0000000008053000 000:00000 [ anon ]
40000000      84 r-x-- 0000000000000000 0fe:00000 ld-2.3.5.so
40015000       8 rw--- 0000000000014000 0fe:00000 ld-2.3.5.so
40017000       4 rw--- 0000000040017000 000:00000 [ anon ]
40018000       4 r-x-- 0000000000000000 0fe:00000 kedit.so
40019000       4 rw--- 0000000000000000 0fe:00000 kedit.so
40027000     252 r-x-- 0000000000000000 0fe:00000 libkparts.so.2.1.0
40066000      20 rw--- 000000000003e000 0fe:00000 libkparts.so.2.1.0
4006b000    3108 r-x-- 0000000000000000 0fe:00000 libkio.so.4.2.0
40374000     116 rw--- 0000000000309000 0fe:00000 libkio.so.4.2.0
40391000       8 rw--- 0000000040391000 000:00000 [ anon ]
40393000    2644 r-x-- 0000000000000000 0fe:00000 libkdeui.so.4.2.0
40628000     164 rw--- 0000000000295000 0fe:00000 libkdeui.so.4.2.0
40651000       4 rw--- 0000000040651000 000:00000 [ anon ]
40652000     100 r-x-- 0000000000000000 0fe:00000 libkdesu.so.4.2.0
4066b000       4 rw--- 0000000000019000 0fe:00000 libkdesu.so.4.2.0
4066c000      68 r-x-- 0000000000000000 0fe:00000 libkwalletclient.so.1.0.0
4067d000       4 rw--- 0000000000011000 0fe:00000 libkwalletclient.so.1.0.0
4067e000       4 rw--- 000000004067e000 000:00000 [ anon ]
4067f000    2148 r-x-- 0000000000000000 0fe:00000 libkdecore.so.4.2.0
40898000      64 rw--- 0000000000219000 0fe:00000 libkdecore.so.4.2.0
408a8000       8 rw--- 00000000408a8000 000:00000 [ anon ]
... (trimmed) ...
mapped: 25404K    writeable/private: 2432K    shared: 0K
I cut out a lot of the output; the rest is similar to what is shown. Even without the complete output, we can see some very interesting things. One important thing to note is that each shared library is listed twice: once for its code segment and once for its data segment. The code segments have a mode of "r-x--", while the data segments are "rw---". The Kbytes, Mode, and Mapping columns are the only ones we will care about; the rest are unimportant to the discussion.
If you go through the output, you will find that the lines with the largest Kbytes numbers are usually the code segments of the included shared libraries (the ones that start with "lib" are the shared libraries). The great thing about those is that they are exactly the parts that can be shared between processes. If you factor out all of the parts that are shared between processes, you end up with the "writeable/private" total, which is shown at the bottom of the output. This can be considered the incremental cost of the process, with the shared libraries factored out. Therefore, the cost to run this instance of KEdit (assuming that all of the shared libraries were already loaded) is around 2 megabytes. That is quite a different story from the 14 or 25 megabytes that ps reported.
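If you only care about that bottom summary line, a quick sketch (assuming the procps pmap, and using the current shell $$ as a stand-in for any pid):

```shell
# Print just the summary line of pmap -d for the current shell.
# The writeable/private figure is the incremental cost of the process
# once the shared library code has been factored out.
pmap -d $$ | tail -n 1
```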
The moral of this story is that process memory usage on Linux is a complex matter; you can't just run ps and know what is going on. This is especially true when you deal with programs that create a lot of identical child processes, like Apache. ps might report that each Apache process uses 10 megabytes of memory, when the reality might be that the marginal cost of each Apache process is 1 megabyte of memory. This information becomes critical when tuning Apache's MaxClients setting, which determines how many simultaneous requests your server can handle (although see one of my past postings for another way of increasing Apache's performance).
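One hedged way to estimate that marginal cost is to sum the private (unshared) kilobytes of each child from /proc/&lt;pid&gt;/smaps, on kernels that provide smaps. The process name "apache2" below is an assumption; substitute "httpd" or whatever your distribution uses:

```shell
# Rough sketch: per-child marginal memory cost for Apache workers.
# "apache2" is an assumed process name; adjust for your system.
for pid in $(pgrep apache2); do
    awk -v pid="$pid" '/^Private/ {sum += $2}
        END {printf "pid %s: %d kB private\n", pid, sum}' "/proc/$pid/smaps"
done
```

The private total is a floor, not an exact figure, but it is a far better guide for sizing MaxClients than the RSS column of ps.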
It also shows that it pays to stick with one desktop's software as much as possible. If you run KDE for your desktop, but mostly use Gnome applications, then you are paying a large price for a lot of redundant (but different) shared libraries. By sticking to just KDE or just Gnome apps as much as possible, you reduce your overall memory usage due to the reduced marginal memory cost of running new KDE or Gnome applications, which allows Linux to use more memory for other interesting things (like the file cache, which speeds up file accesses immensely).
This completely ignores the fact that each of those shared libraries takes up a number of virtual-to-physical page mappings. These, unlike memory, are a very precious resource. Just because people tend to ignore this doesn't mean that shared libraries don't have a high price. Try running something like KDE on non-x86 hardware that doesn't have huge translation tables. On MIPS hardware, I've seen apps spend over 30% of their time updating TLB entries.
You guys seem to forget that every program uses text and data segments. In the simplest case, the text is the program code, which is shared, in a well designed OS like Linux, among all instances of a given program. The data, like the stack or locally allocated (malloced) memory, is different for every process. It's very important not to forget about the data: a badly written editor (pick any GUI editor) might allocate a huge amount of memory just to open a simple logfile that happens to be 500MB, and that is just plain wrong!
The ability to use a single physical copy of a shared library isn't a great Linux trick; it's a common feature of just about every OS that uses shared libraries. Btw, the same principle applies to executables as well. For instance, if you have several xterms running, you will only have one copy of the xterm executable program text in memory, shared between all instances (obviously, each instance will have its own stack and heap).
> 'ps' is a bit crap then? can anyone recommend any other programs that read
> 'actual' memory usage?
Why bother with pmap? 'ps' gives basically the same answer, if you're not addicted to the BSD format. Look for the definition of the 'SZ' column in the man page. It's overly technical, but it boils down to the writable/private data that pmap gave, and you can see the whole list at once. For example
ps -lyu <user>
What about using lsof for determining sizes? It will list libraries separately, and you get a calculated size of the components using memory/VFS space.
> I was hoping to see some explanation of what RAM is used by the running
> processes and what is used for file system cache.
Another way to do that is with the command "free"; you'll want the second line, which folds the buffers and cache back into the free column.
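For example (the output format varies between procps versions; older ones print a "-/+ buffers/cache" second line, newer ones an "available" column):

```shell
# Show memory totals in megabytes.  The memory counted under buffers
# and cache is reclaimable, so it is effectively available to
# applications even though it doesn't show up as "free".
free -m
```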
Mantar
In addition to "free", you can also just "cat /proc/meminfo". Of course, that level of detail is not for the faint of heart!
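If you just want the headline numbers, grep out the interesting fields rather than reading the whole dump:

```shell
# Pull the most commonly wanted fields out of /proc/meminfo.
grep -E '^(MemTotal|MemFree|Buffers|Cached):' /proc/meminfo
```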
And if even that isn't enough, take a look at /proc/slabinfo, where the amount of memory for each subsystem inside the kernel is recorded! It'll tell you how many inodes are allocated (and in use) for each filesystem type (but not for each specific file system mount point).
As an instructor who teaches Linux Internals, I think the article is a good overview and starting point for understanding Linux memory allocation in applications.
The comments concerning the Translation Lookaside Buffer (TLB) are a bit disingenuous. The TLB caches translations of 32-bit logical addresses into 32-bit (or 36-bit) physical addresses on x86 hardware, so if a vm_area in one process refers to the same pages as a vm_area in another process, the TLB entries could in principle be shared; however, Linux doesn't currently try to track this and simply flushes the entire TLB on process context switches (at least as of the 2.6.9 kernel, the last one I examined on this topic). As for the use of -fPIC: all standard libraries that I know of are compiled with the "position-independent code" option.
One individual mentioned management of page tables; but page tables are always a constant size (based on the number of frames of memory available minus those frames used to hold the page tables themselves). At least, a constant size when the page is a given size (typically 4K on x86, although there is a compile-time option for 8K page sizes). I'm not familiar with the impact of HugeTLB pages, as I haven't studied them in detail and I wouldn't want to knowingly spread misinformation. (And again, most of this info goes back to 2.6.9, but some is more recent.)
I'm glad someone mentioned that Linux (and other modern Un*ces) use an mmap-based scheme for heap space; most people don't know that. :) What's really weird is the sequence of address space ranges that get allocated to contain the heap! ;) The libc malloc() implementation jumps all over the place within the process's address space, mapping /dev/zero to allocate heap space. And of course, allocating address space in a process doesn't require any physical memory to exist, so a "writeable/private" of 20MB might have as little as 100KB of actually allocated virtual-to-physical memory pages. I'm not talking about paged-out memory here, but "ever allocated" memory.
Anyway, if you want more information, pick up a copy of Robert Love's book on Linux Kernel Development, or download the Gorman book, Understanding the Linux Virtual Memory Manager (the PDF is available elsewhere; google for it).
I've a script to accurately list programs' ram usage here: http://www.pixelbeat.org/scripts/ps_mem.py
You can't assume all memory associated with a shared lib is shared. Also, pmap reports just virtual memory, large parts of which (for shared libs) are likely to stay paged out to disk forever.
My script uses the more accurate /proc/…/smaps available in newer kernels to determine the mem used for a process. Compare and contrast:
$ pmap $$ | grep libc
00a13000   1168K r-x--  /lib/libc-2.3.5.so
00b37000      8K r-x--  /lib/libc-2.3.5.so
00b39000      8K rwx--  /lib/libc-2.3.5.so
$ grep libc -A6 /proc/$$/smaps
00a13000-00b37000 r-xp 00000000 08:06 513762    /lib/libc-2.3.5.so
Size:              1168 kB
Rss:                492 kB
Shared_Clean:       492 kB
Shared_Dirty:         0 kB
Private_Clean:        0 kB
Private_Dirty:        0 kB
00b37000-00b39000 r-xp 00124000 08:06 513762    /lib/libc-2.3.5.so
Size:                 8 kB
Rss:                  8 kB
Shared_Clean:         4 kB
Shared_Dirty:         0 kB
Private_Clean:        0 kB
Private_Dirty:        4 kB
00b39000-00b3b000 rwxp 00126000 08:06 513762    /lib/libc-2.3.5.so
Size:                 8 kB
Rss:                  8 kB
Shared_Clean:         0 kB
Shared_Dirty:         0 kB
Private_Clean:        0 kB
Private_Dirty:        8 kB
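The accounting idea behind the script can be sketched with awk over smaps. This is only the gist, not the script itself, and it uses the current shell $$ as a stand-in for any pid:

```shell
# Total up resident, shared, and private kilobytes for one process
# from its smaps entries (requires a kernel that provides smaps).
awk '/^Rss:/    {rss    += $2}
     /^Shared/  {shared += $2}
     /^Private/ {priv   += $2}
     END {printf "rss=%dkB shared=%dkB private=%dkB\n", rss, shared, priv}' \
    /proc/$$/smaps
```

The private total is what the process really adds on top of the libraries it shares; the shared total is what ps and pmap double-count.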
Note: not accurate, some of it at least. There's no way bash is more than two times bigger than fluxbox:
$ ps -lyu tong
S   UID   PID  PPID  C PRI  NI  RSS   SZ WCHAN  TTY   TIME     CMD
S  9999  2892  2868  0  76   0 2328  923 wait   tty1  00:00:00 bash
S  9999  3382  2869  0  75   0 2280  919 -      tty2  00:00:00 bash
S  9999  9999  2892  0  79   0 1432  747 wait   tty1  00:00:00 startx
S  9999 10017  9999  0  76   0  672  603 wait   tty1  00:00:00 xinit
S  9999 10050 10017  0  75   0 4836 2310 -      tty1  00:00:01 fluxbox
S  9999 10121 10050  0  76   0  720 1119 429496 ?     00:00:00 ssh-agent
S  9999 10122 10050  0  76   0  720 1118 429496 ?     00:00:00 ssh-agent
S  9999 10125     1  0  76   0  640  658 429496 tty1  00:00:00 dbus-launch
S  9999 10126     1  0  85   0  436  510 1      ?     00:00:00 dbus-daemon
S  9999 10128 10050  0  75   0 3048 1549 1      tty1  00:00:00 xterm
S  9999 10129 10128  0  75   0 2312  922 wait   pts/0 00:00:00 bash
S  9999 10137 10050  0  75   0 9048 7008 3      tty1  00:00:00 gnome-settin
% ps_mem.py
436.0K  dbus-daemon
640.0K  dbus-launch
672.0K  xinit
992.0K  ssh-agent
  1.4M  startx
  4.7M  fluxbox
  5.8M  xterm
  8.8M  gnome-settings-
  9.3M  bash
> If you factor out all of the parts that are shared between processes, you
> end up with the "writeable/private" total, which is shown at the bottom of
> the output. This is what can be considered the incremental cost of this
> process, factoring out the shared libraries.
I wrote a tool called Exmap which attempts to address this. i.e. you don't have to factor out the shared usage. http://www.berthels.co.uk/exmap
It uses a loadable kernel module to work out which pages are actually shared between processes. This should accurately account for demand-paging, copy-on-write and all that good stuff.
If 3 processes all have a page mapped, that page contributes (PAGE_SIZE/3) to each process's total. It gives figures on various things, including RAM used, (RAM+swap) used, writable, etc.
It also allows you to break things down by proc/file/ELF section/ELF symbol.
It's a bit of a 'raw' tool, but should be usable by anyone who understands the issues you describe in your post :-)
John Berthels
The comment about fragmentation of the heap is a bit off-base. First, the Linux allocator manages large objects with mmap, which can free memory chunks no matter where they are in the heap. Second, other memory allocators (BSD's, Hoard, Vam) use mmap exclusively, so they can always free up empty pages. Third, garbage collection invariably requires more space, and always swaps far more than malloc. Why? Because it periodically has to touch EVERY PAGE - including those swapped to disk - while it looks for garbage. For more info about GC and swapping, read our "Garbage Collection without Paging" or "Automatic Heap Sizing" papers; for more about page-oriented allocators, read one of our papers about Hoard or Vam (all linked from my web page).
Very informative. But the two apps most often accused of bloat, Firefox and OpenOffice, don't fare so well in this respect:
Openoffice:
mapped: 200540K    writeable/private: 80480K    shared: 34628K

Firefox:
mapped: 128024K    writeable/private: 95096K    shared:   576K
I guess that's the price for cross-platform compatibility…
> Firefox needs lots of memory because it needs to cache in memory a rendered
> copy of every tab (or at least enough information to re-render it).
> Pictures are big, especially once they're decompressed; that 20K JPEG
> might be 2M once converted back to a 24-bit bitmap.
Another problem with Firefox on startup is that, using GNOME libraries, it reads an almost ridiculous number of GNOME XML config files, over 700 IIRC.
I can't always agree with the "features != bloat" argument. Who needs a flashing icon at the bottom right edge of the mouse pointer in KDE when starting apps, to give one example?
Well, anyway, here's a comparison of our most beloved editors (name private/mapped/number of libs)
joe      728K /  2624K /  24
vim      836K /  3804K /  37
mcedit  1292K /  5884K /  52
emacs   3912K / 11472K /  82
kate    1392K / 24676K / 298
And despite the libs being shared (note that they also need -fpic / -fPIC during compilation, otherwise the sharing is for nothing!), kate still loads the most libraries, even considered as a single process running by itself (kate, for example, starts a handful of KDE services if they aren't already running). Not all of those libraries might be mapped into memory immediately, but once you use all of its 'features', it uses the most memory.
> It's a slippery slope to say that 99% of users don't use some particular
> configuration. They used to say something similar about computer users in
> general - 99% don't use Linux.
Well now, that's a lie, isn't it? Several years ago Linux passed the Mac in desktop use, at about 5%. So saying 99% don't use Linux is off by at least 4%, likely more. As for servers, I can argue that far more than 5% use Linux, by a factor of about 5x. For supercomputers, I can wander over to top500.org and get the numbers broken down by OS: Linux runs on 371 of the 500 fastest machines in the world, including all of the top 5. Windows has 1 machine (ranked in 310th spot), a machine sponsored by Microsoft so that they can say they have a machine on the list. Where did you get 99% from? Certainly not from those who know about computers and performance, not by a damned longshot!