Confused about PSS in /proc/pid/smaps

Question

I found a great explanation of smaps in "info about smaps".

My understanding was that

shared_clean + shared_dirty + private_clean + private_dirty = rss

I wrote a program to verify it:

#include <cstdio>    // printf
#include <unistd.h>  // sleep

void sa();
int main(int argc,char *argv[])
{
    sa();
    sleep(1000);
}

void sa()
{
   char *pi=new char[1024*1024*10];
   for(int i=0;i<4;++i) {   //dirty should be 4M
        for(int j=0;j<1024*1024;++j){
                *pi='o';
                pi++;
        }
   }
   int cnt;
   for(int i=0;i<6;++i) {   //clean should be 6M
        for(int j=0;j<1024*1024;++j){
                cnt+=*pi;
                pi++;
        }
   }
   printf("%d",cnt);
}

But to my surprise, /proc/pid/smaps shows:

09b69000-09b8c000 rw-p 00000000 00:00 0 [heap]
...
Size:           10252 kB
Rss:            10252 kB
Pss:             4108 kB //<----I thought it should be 10M
Shared_Clean:       0 kB
Shared_Dirty:       0 kB
Private_Clean:      0 kB //<----I thought it should be 6M
Private_Dirty:   4108 kB
Referenced:      4108 kB
Swap:               0 kB
KernelPageSize:     4 kB
MMUPageSize:        4 kB

Anything wrong with my understanding?

According to Mat's answer:

The pages in the 6M you're only reading can't really be considered clean. A clean page is one that is synchronized with its backing store (whatever that is, swap, a file, etc.).

So I rewrote the code using mmap, and this time the result is as expected :)

Create a dummy file first:

time dd if=/dev/zero of=test.bin bs=30000000 count=1

New code:

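#include <cstdio>     // printf, perror
#include <fcntl.h>    // open
#include <sys/mman.h> // mmap
#include <unistd.h>   // sleep

void sa(char *pi)
{
   // write the first 4M: these pages become dirty
   for(int i=0;i<4;++i) {
        for(int j=0;j<1024*1024;++j){
                *pi='a';
                pi++;
        }
   }
   // read the next 6M: these pages stay clean;
   // dummycnt is printed so the compiler cannot optimize the read loop away
   int dummycnt=0;
   for(int i=0;i<6;++i) {
        for(int j=0;j<1024*1024;++j){
                dummycnt+=*pi;
                pi++;
        }
   }
   printf("%d",dummycnt);
}


int main()
{
       int fd  = open("./test.bin",O_RDWR);
       if (fd < 0) { perror("open"); return 1; }
       char *p = (char *)mmap(0,
                      1024*1024*10, //10M
                      PROT_READ|PROT_WRITE,
                      MAP_SHARED,
                      fd,
                      0);
       if (p == MAP_FAILED) { perror("mmap"); return 1; }
       sa(p);
       sleep(10000);
}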

cat /proc/pid/smaps:

b6eae000-b78ae000 rw-s 00000000 08:05 134424     ..../test.bin
Size:              10240 kB
Rss:               10240 kB
Pss:               10240 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:      6144 kB
Private_Dirty:      4096 kB
Referenced:        10240 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
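
The relation shared_clean + shared_dirty + private_clean + private_dirty = rss can be checked mechanically against the two dumps above. The following is a minimal sketch, not part of the original post: the helper names `parse_smaps_entry`, `accounted_kb` and `unaccounted_kb` are made up for illustration, and the parser only assumes that each counter line has the form "Name: value kB".

```cpp
#include <initializer_list>
#include <map>
#include <sstream>
#include <string>

// Parses the "Name:   123 kB" lines of one smaps entry into name -> kB.
// Lines that do not have this shape (such as the address-range header) are skipped.
std::map<std::string, long> parse_smaps_entry(const std::string &text) {
    std::map<std::string, long> fields;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ls(line);
        std::string name, unit;
        long value = 0;
        if (ls >> name >> value >> unit && unit == "kB" &&
            name.size() > 1 && name.back() == ':') {
            name.pop_back();  // drop the trailing ':'
            fields[name] = value;
        }
    }
    return fields;
}

// Sum of the four shared/private clean/dirty counters, in kB.
long accounted_kb(const std::map<std::string, long> &f) {
    long total = 0;
    for (const char *key : {"Shared_Clean", "Shared_Dirty",
                            "Private_Clean", "Private_Dirty"}) {
        auto it = f.find(key);
        if (it != f.end()) total += it->second;
    }
    return total;
}

// Rss minus the four counters: the part of Rss that none of them covers.
long unaccounted_kb(const std::map<std::string, long> &f) {
    auto rss = f.find("Rss");
    return (rss == f.end() ? 0 : rss->second) - accounted_kb(f);
}
```

Fed the [heap] dump, `unaccounted_kb` gives 10252 - 4108 = 6144 kB, which is exactly the 6M that was only read. Fed the test.bin dump, it gives 0, so the relation holds for the file-backed mapping.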

score 3 · Accepted Answer · answered Mar 31 '12 at 08:24

First off, your code exhibits undefined behavior (cnt is used without having been initialized, same for the top 6M you're reading without initializing), so make sure your compiler actually outputs instructions that match your code: it doesn't have to. (I'm assuming you've checked that.)

The pages in the 6M you're only reading can't really be considered clean. A clean page is one that is synchronized with its backing store (whatever that is, swap, a file, etc.). Those pages don't have anything that backs them.
They're not really dirty either in the usual sense - after all, they haven't been modified.

So what's happening here? All the pages in the 6M block you're only reading are mapped to the same physical page: the "zero page", a single page (shared, on x86 at least) that contains 4k of zero bytes.

When the kernel gets a page fault on an unmapped anonymous page, and that fault is a read, it maps in a zero page (the same page each time). (This is in do_anonymous_page in mm/memory.c)
This is not a "normal" mapping (in the vm_normal_page sense), and does not get accounted in the smaps fields as shared or private anything (smaps_pte_entry in fs/proc/task_mmu.c skips "special" pages entirely). It does get accounted in RSS and Size though: from an address space perspective, these virtual pages exist and have been "used".
If you start writing to any page in that area, it will get a proper, normal mapping with an anonymous page. (Interestingly, that page is zero-initialized in this specific case; it won't be zero-initialized if the previous (non-normal/fake) mapping wasn't to the zero page. See do_wp_page in mm/memory.c.) At that point you'll see smaps display what you expect.
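
The read-then-write behaviour described above can be observed from inside a process. Here is a rough sketch under some assumptions: it uses an anonymous private mmap instead of new[] so the region is easy to locate, and the names `smaps_field_kb`, `touch_experiment` and `TouchResult` are hypothetical helpers, not kernel or libc APIs. Exact numbers depend on the kernel version, on transparent huge pages, and on whether the kernel merges the new region with a neighbouring mapping.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <string>

// Value in kB of a numeric `field` (for example "Private_Dirty") for the
// mapping in /proc/self/smaps that contains `addr`, or -1 if it is not found.
long smaps_field_kb(const void *addr, const std::string &field) {
    const auto target = reinterpret_cast<std::uintptr_t>(addr);
    const std::string prefix = field + ":";
    std::ifstream in("/proc/self/smaps");
    std::string line;
    bool inside = false;
    while (std::getline(in, line)) {
        unsigned long long start = 0, end = 0;
        // Header lines look like "09b69000-09b8c000 rw-p ...".
        if (std::sscanf(line.c_str(), "%llx-%llx", &start, &end) == 2) {
            inside = start <= target && target < end;
        } else if (inside && line.compare(0, prefix.size(), prefix) == 0) {
            return std::stol(line.substr(prefix.size()));
        }
    }
    return -1;
}

struct TouchResult {
    long dirty_after_read_kb;   // Private_Dirty after only reading the region
    long dirty_after_write_kb;  // Private_Dirty after writing to every page
};

// Maps `bytes` of anonymous memory, reads one byte per page, then writes one
// byte per page, sampling Private_Dirty after each pass.
TouchResult touch_experiment(std::size_t bytes) {
    void *raw = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED) return {-1, -1};
    volatile char *p = static_cast<volatile char *>(raw);
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));

    long sum = 0;
    for (std::size_t i = 0; i < bytes; i += page) sum += p[i];  // read faults
    (void)sum;
    const long after_read = smaps_field_kb(raw, "Private_Dirty");

    for (std::size_t i = 0; i < bytes; i += page) p[i] = 'o';   // write faults
    const long after_write = smaps_field_kb(raw, "Private_Dirty");

    munmap(raw, bytes);
    return {after_read, after_write};
}
```

Calling `touch_experiment(4 * 1024 * 1024)` from main and printing both fields should show Private_Dirty covering at least the 4096 kB region once it has been written. If the kernel behaves as described in this answer, the read pass alone should not add to it.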

Please do note that nothing in C, POSIX or anything else guarantees these pages to contain zeros, so you can't rely on that. (You can't actually rely on that on Linux either - that's how it is implemented right now, but it could conceivably change.)
