Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Page Management: The Keeper of Rooms

A city of machines keeps its citizens in rooms. Some rooms are occupied, some are idle but still furnished, and some are entirely empty. The keeper is not a poet; she is a clerk with ledgers. Her work is to find a room quickly, remember who it belongs to, and decide when a room should be cleaned and returned to the pool.

SVR4’s page management system is that keeper. It tracks each physical page with a page_t record, ties pages to vnode identities, and manages two free lists that distinguish between empty rooms and rooms that still remember their last occupant.


The Ledger Entry: struct page

Every physical page has a struct page record that carries its identity, state, and list links (vm/page.h:57-101).

typedef struct page {
	u_int	p_lock: 1,
		p_want: 1,
		p_free: 1,
		p_intrans: 1,
		p_gone: 1,
		p_mod: 1,
		p_ref: 1,
		p_pagein: 1,
		p_nc: 1,
		p_age: 1;
	u_int	p_nio : 6;
	u_short	p_keepcnt;
	struct	vnode *p_vnode;
	u_int	p_offset;
	struct page *p_hash;
	struct page *p_next;
	struct page *p_prev;
	struct page *p_vpnext;
	struct page *p_vpprev;
	struct phat p_hat;
	u_short	p_lckcnt;
	u_short	p_cowcnt;
} page_t;

The Room Ledger (vm/page.h:57-89)

The fields mirror the keeper’s daily questions. Is the room free (p_free)? Is a move underway (p_intrans)? Does it remember its owner (p_vnode and p_offset)? The list pointers keep the page on the free lists or on a vnode’s page ring. The p_hat mapping field, shared with the HAT layer, is the keeper’s seal for whether the room is currently mapped (vm/page.h:77-87).

The ledger is also aware of noncontiguous physical memory. pageac describes each contiguous span of page frames so the allocator can search disjoint segments when a request needs physical contiguity (vm/page.h:113-125).


Page Management - Farmer’s Field Page Management - Farmer’s Field

The Two Free Lists: Empty Rooms and Remembered Rooms

page_free() is the clerk’s act of returning a room to the pool. SVR4 does not use a single list; it keeps two circular lists: page_freelist for pages with no vnode identity and page_cachelist for pages that still remember their former contents (vm/vm_page.c:764-888).

/* Put page on the "free" list.  The free list is really two circular lists
 * with page_freelist and page_cachelist pointers into the middle of the lists.
 */
void
page_free(pp, dontneed)
	register page_t *pp;
	int dontneed;
{
	...
	if (vp == NULL) {
		/* page has no identity, put it on the front of the free list */
		page_add(&page_freelist, pp);
	} else {
		page_add(&page_cachelist, pp);
		if (!dontneed || nopageage)
			page_cachelist = page_cachelist->p_next;
	}
	...
}

The Two Pools (vm/vm_page.c:764-879, abridged)

A page with no identity goes straight to the front of page_freelist. A page with a vnode identity is cached on page_cachelist, and by default is pushed toward the tail to age out last. This is the keeper’s compromise: keep the most recently used rooms warm in case the same tenant returns.

Page Allocation Flow Figure 2.3.1: Free and Cache Lists in the Allocator


Naming and Finding Rooms: Hash Chains and Vnode Rings

The page hash is the index card catalog for the keeper. page_hashin() adds a page to the hash table and to the vnode’s circular list (vm/vm_page.c:1759-1804). page_find() and page_lookup() search by <vnode, offset>, reclaiming a page from the cache list if necessary (vm/vm_page.c:550-651).

page_t *
page_find(vp, off)
	register struct vnode *vp;
	register u_int off;
{
	for (pp = page_hash[PAGE_HASHFUNC(vp, off)]; pp; pp = pp->p_hash)
		if (pp->p_vnode == vp && pp->p_offset == off && pp->p_gone == 0)
			break;
	if (pp != NULL && pp->p_free)
		page_reclaim(pp);
	return (pp);
}

The Catalog Lookup (vm/vm_page.c:550-590, abridged)

page_lookup() adds the patience of a clerk who waits for a room in transit. If a page is being paged in, it waits on the page’s condition before reclaiming it (vm/vm_page.c:614-649). The key idea is that identity wins: if the vnode/offset matches, we do not manufacture a new page, we reclaim the old one.

Page Identity Chains Figure 2.3.2: Hash Table and Vnode Page Rings


Allocation: Asking for Rooms

page_get() is the official request for physical pages. It enforces resource limits, optionally sleeps, and prods the pageout daemon when memory is tight (vm/vm_page.c:1050-1154). It also handles the special case of physically contiguous allocation, walking the pageac table to find a run of free frames (vm/vm_page.c:1159-1188).

page_t *
page_get(bytes, flags)
	u_int bytes;
	u_int flags;
{
	npages = btopr(bytes);
	reqfree = (flags & P_NORESOURCELIM) ? npages : npages + minpagefree;
	while (freemem < reqfree) {
		if (!(flags & P_CANWAIT))
			return (NULL);
		if (bclnlist != NULL)
			cleanup();
		outofmem();
		(void) sleep((caddr_t)&freemem, PSWP+2);
	}
	...
}

The Allocation Request (vm/vm_page.c:1050-1154, abridged)

The flags tell the keeper how urgent the request is: P_CANWAIT permits sleeping, while P_PHYSCONTIG demands a contiguous stretch for DMA or device mappings. The allocator balances fairness (resource limits) with liveness (invoking outofmem() to drive pageout).


The Quiet Cleanup: page_abort() and Reclamation

When a page loses its identity, page_abort() clears mappings, marks the page gone, and drops it back to page_free() (vm/vm_page.c:707-761). This path is the keeper’s reset: tear down ownership, flush mappings, and return the room to the pool. If the page is still in transit, the cleanup waits for pvn_done() to finish the I/O before finally freeing it.


The Ghost of SVR4:

We kept one central ledger and two long queues. In your time, the keeper has been split into many clerks. Modern kernels maintain per-CPU page lists, per-node pools for NUMA, and memcg-aware LRU lists for containers. The policy is more complex, but the core idea has not changed: a page is a room with a name, and the kernel must always know which rooms are empty, which are cached, and which still belong to someone.


Closing the Ledger

Page management is the art of disciplined forgetting. The keeper must remember just enough to reuse the right rooms quickly, and forget enough to keep the city breathing. SVR4’s two-list allocator, its hash catalog, and its careful wait loops are the steady hand that keeps the rooms available.