The main idea is to catch all virtual memory allocations per process (per
user). That gives information about each user's memory allocations, namely all
virtual memory pages that can be swapped.
On a 32-bit Intel CPU addresses are 32 bits wide, so up to 4GB of memory can be
addressed. A machine therefore almost always has less physical memory (RAM) than
it can address virtually. Every process in Linux can allocate up to 3GB of
virtual address space (the remaining 1GB is used by the kernel). In kernel 2.4.x
128MB of the kernel's 1GB is reserved for kernel paging, so only 896MB of RAM
can be mapped directly. On a machine with 1.2GB of RAM the tasks allocate
virtual pages from their 3GB and the kernel uses its 1GB, where the 128MB
window is used to page in the roughly 330MB of RAM that is not mapped
directly. Memory allocation is divided into 2 parts: allocation of virtual
memory pages and mapping of those pages to physical frames.
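The arithmetic behind the 1.2GB example can be sketched as follows. This is
only an illustration: the constants and the names direct_mapped_mb() and
paged_in_mb() are assumptions made for this sketch (a 1GB kernel part with a
128MB reserved window), not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of a 2.4-era 32-bit x86 kernel (illustrative constants). */
#define KERNEL_SPACE_MB   1024u   /* kernel part of the 4GB address space */
#define KERNEL_RESERVE_MB  128u   /* window reserved for kernel paging    */

/* RAM (in MB) that the kernel can keep permanently mapped. */
static inline uint32_t direct_mapped_mb(uint32_t ram_mb)
{
    uint32_t max_direct = KERNEL_SPACE_MB - KERNEL_RESERVE_MB;  /* 896 */

    assert(KERNEL_RESERVE_MB < KERNEL_SPACE_MB);
    return ram_mb < max_direct ? ram_mb : max_direct;
}

/* RAM (in MB) that can only be reached through the reserved window. */
static inline uint32_t paged_in_mb(uint32_t ram_mb)
{
    return ram_mb - direct_mapped_mb(ram_mb);
}
```

With 1229MB (about 1.2GB) of RAM, direct_mapped_mb() gives 896 and
paged_in_mb() gives 333. With 512MB of RAM nothing has to go through the
window.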
2. Virtual memory allocation

When a process asks for space, the system allocates it from the process
virtual address space (3GB per process). The allocated space is described by a
virtual memory area (VMA) structure (see struct vm_area_struct). All VMAs of a
process are organized in a linked list that is attached to the memory
management struct of the process (see struct mm_struct). When the process
accesses a virtual address for the first time, the page is mapped to a frame
(physical page). How is the frame found?
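The VMA list can be pictured with a small user-space model. This is a sketch
under assumptions: struct toy_vma, struct toy_mm and toy_vma_lookup() are
hypothetical names that keep only the fields needed here, and the list is
assumed to be sorted by address. The real structures carry many more fields.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct vm_area_struct. */
struct toy_vma {
    unsigned long vm_start;      /* first address inside the area */
    unsigned long vm_end;        /* first address after the area  */
    struct toy_vma *vm_next;     /* next area, sorted by address  */
};

/* Simplified stand-in for struct mm_struct. */
struct toy_mm {
    struct toy_vma *mmap;        /* head of the VMA list */
};

/* Return the VMA that contains addr, or NULL if addr is not allocated. */
static inline struct toy_vma *toy_vma_lookup(const struct toy_mm *mm,
                                             unsigned long addr)
{
    struct toy_vma *vma;

    assert(mm != NULL);
    for (vma = mm->mmap; vma != NULL; vma = vma->vm_next) {
        if (addr < vma->vm_start)
            return NULL;         /* sorted list: addr falls into a hole */
        if (addr < vma->vm_end)
            return vma;
    }
    return NULL;
}
```

An address that lies between two areas, or above the last one, yields NULL,
which corresponds to an access outside of any allocated region.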
To address memory on 64-bit architectures, Linux uses a 3-level paging model.
The virtual address is divided into 4 parts: global, middle, table and offset.
The first table (Page Global Directory, 1024 entries) is found through the
address stored in processor register cr3. The global part (10bit) of the
virtual address is the entry index in the PGD, where the address of the second
table (Page Middle Directory, 1024 entries) is stored. The middle part (??bit)
of the virtual address gives the entry index in the PMD, where the address of
the Page Table is stored. The table part of the virtual address gives the index
in the PT, where the page (frame) address is stored. Together with the offset
part of the virtual address this gives the exact address inside the frame. On
32-bit architectures the virtual address is divided into 3 parts: global
(10bit), table (10bit) and offset (12bit). Linux creates a PMD with 1 entry, so
the same code works for both 32 and 64 bit. Because the global part is 10 bits
wide, it indexes 2^10 = 1024 entries in the PGD; the same holds for the PT. The
offset is 12 bits wide, so it covers 2^12 = 4096 bytes, which is the size of a
frame.
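The 32-bit split described above (10 + 10 + 12 bits) can be written down
directly. A minimal sketch: struct va_parts and split_va() are names made up
for this example, while the shift values follow from the bit widths given in
the text.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT    12       /* offset part: 12 bits, 4096-byte frame */
#define PGDIR_SHIFT   22       /* table part (10) + offset part (12)    */
#define PTRS_PER_TBL  1024u    /* 2^10 entries in the PGD and in a PT   */

struct va_parts {
    uint32_t global;           /* index into the PGD      */
    uint32_t table;            /* index into the PT       */
    uint32_t offset;           /* byte inside the frame   */
};

/* Split a 32-bit virtual address into its three parts. */
static inline struct va_parts split_va(uint32_t va)
{
    struct va_parts p;

    p.global = va >> PGDIR_SHIFT;
    p.table  = (va >> PAGE_SHIFT) & (PTRS_PER_TBL - 1);
    p.offset = va & ((1u << PAGE_SHIFT) - 1);
    assert(p.global < PTRS_PER_TBL);
    return p;
}
```

For the address 0xC0101234 this gives global = 768, table = 257 and
offset = 0x234.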
 ______________________________________________________________________
|   Global dir    |   Middle dir    |      Table      |     Offset     |
|_________________|_________________|_________________|________________|
         |                 |                 |                |
         |                 |                 |                |     Page
         |                 |                 |      PT        +-->|XXXXXX|
         |                 |      PMD        +-->|XXXXXX|-------->|______|
         |       PGD       +--->|XXXXXX|-------->|______|
         +---->|XXXXXX|-------->|______|
  cr3 -------->|______|

Pic 1. Linux paging model
How is the physical frame address found? A number of macros help to recover the
pte for an address. For example, pgd_offset() gives the PGD entry and
pmd_offset() gives the PMD entry. pmd_none() checks whether the corresponding
entry is 0, and pte_none() does the same for a page table entry. Some more
macros are listed below (see paging).
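The way such helpers are chained can be shown with a user-space model of a
32-bit walk. This is a sketch, not the kernel macros: every toy_* name and type
is hypothetical, the PMD level is left out (it has a single entry on 32 bit),
and an entry is treated as empty when it is 0 or NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ENTRIES 1024u

typedef struct { uint32_t val; } toy_pte_t;       /* frame address | flags */
typedef struct { toy_pte_t *table; } toy_pgd_t;   /* page table or NULL    */

static inline toy_pgd_t *toy_pgd_offset(toy_pgd_t *pgd, uint32_t va)
{
    return &pgd[va >> 22];                        /* global part */
}

static inline int toy_pgd_none(const toy_pgd_t *e)
{
    return e->table == NULL;
}

static inline toy_pte_t *toy_pte_offset(const toy_pgd_t *e, uint32_t va)
{
    return &e->table[(va >> 12) & (TOY_ENTRIES - 1)];   /* table part */
}

static inline int toy_pte_none(const toy_pte_t *e)
{
    return e->val == 0;
}

/* Store the physical address in *pa and return 1, or return 0 if unmapped. */
static inline int toy_translate(toy_pgd_t *pgd, uint32_t va, uint32_t *pa)
{
    toy_pgd_t *pgd_e;
    toy_pte_t *pte;

    assert(pgd != NULL && pa != NULL);
    pgd_e = toy_pgd_offset(pgd, va);
    if (toy_pgd_none(pgd_e))
        return 0;
    pte = toy_pte_offset(pgd_e, va);
    if (toy_pte_none(pte))
        return 0;
    *pa = (pte->val & ~0xfffu) | (va & 0xfffu);   /* frame + offset */
    return 1;
}
```

If entry 768 of the toy PGD points to a page table whose entry 257 holds
0x00123001, the address 0xC0101234 translates to 0x00123234. Any address whose
PGD or PT entry is empty is reported as unmapped.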
Each process is described in the kernel by a process descriptor of type
task_struct (see struct task_struct). Each execution context that can be
independently scheduled must have its own process descriptor. The kernel refers
to a process by the address of its descriptor (the process descriptor pointer).
The state field of task_struct describes the possible process states, which are
TASK_RUNNING - process is executing or waiting to be executed
TASK_INTERRUPTIBLE - suspended (sleeping) until some condition becomes true;
a delivered signal can also wake it up
TASK_UNINTERRUPTIBLE - suspended and waits for a certain condition or event to
occur; a signal does not wake it up
TASK_STOPPED - stopped by a signal
TASK_ZOMBIE - process execution is terminated but the parent process has not yet
issued a wait()-like system call
The kernel reserves a global static array of size NR_TASKS (max number of
handled processes) called task (kernel/sched.c) in its own address space. The
virtual memory of a process is traditionally partitioned into the following
memory areas:
CODE (or TEXT) - segment for executable code
DATA - segment that contains the initialized data (static and global vars whose
initial values are stored in the executable)
BSS - segment that contains the uninitialized data (global vars whose initial
values are not stored in the executable)
STACK - initial program stack (User Mode stack)
HEAP - area where memory is dynamically requested by the program
Another area holds the executable code and data of the needed shared libraries.
Each segment, from the virtual memory point of view, is a linked list of VMAs
(struct vm_area_struct). Each process has almost 3GB of address space in which
it lives. The picture of the process address space:
 ______________________________________________________________
| CODE -->           |        DATA        |          <-- STACK |
|____________________|____________________|____________________|
0x00000000                                            0xc0000000

Pic 2. Segments location in process address space
The Code segment starts at the beginning (0x0000) and grows up. Each time a new
library is loaded, a new VMA is allocated. The Data segment is placed in the
middle, and all dynamic allocations of the process are placed there by creating
new VMAs. The STACK segment is always placed at the end, so it ends at
0xc0000000, and it contains only 1 VMA. When more space is needed, that VMA
changes its vm_start (struct vm_area_struct). That means the Stack segment
grows down (the value of vm_start decreases).
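Growing the stack by moving vm_start can be sketched like this. It is a toy
model under assumptions: struct toy_stack_vma and toy_grow_stack() are
hypothetical names, the page size is taken as 4096 bytes, and max_size stands
in for whatever stack limit applies to the process.

```c
#include <assert.h>

#define TOY_PAGE_SIZE 4096UL
#define TOY_STACK_TOP 0xc0000000UL   /* end of the user address space */

/* The single VMA of the stack segment; only two fields are modelled. */
struct toy_stack_vma {
    unsigned long vm_start;          /* moves down as the stack grows */
    unsigned long vm_end;            /* stays fixed                   */
};

/*
 * Make the stack VMA cover addr by lowering vm_start to the page that
 * contains addr. Returns 0 on success, -1 if the stack would become
 * larger than max_size bytes.
 */
static inline int toy_grow_stack(struct toy_stack_vma *vma,
                                 unsigned long addr, unsigned long max_size)
{
    unsigned long new_start = addr & ~(TOY_PAGE_SIZE - 1);

    assert(vma->vm_start < vma->vm_end);
    if (new_start >= vma->vm_start)
        return 0;                    /* nothing to grow */
    if (vma->vm_end - new_start > max_size)
        return -1;                   /* limit exceeded  */
    vma->vm_start = new_start;
    return 0;
}
```

Starting from a stack that spans 0xbfffe000-0xc0000000, a reference to
0xbfffc123 lowers vm_start to 0xbfffc000 while vm_end stays at 0xc0000000.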
Memory regions are implemented by means of descriptors of type vm_area_struct.
Each memory region consists of a set of pages with consecutive page numbers.
The memory region flags are stored in the vm_flags field of struct
vm_area_struct. Some of them give the kernel information about all pages of the
region (access rights, contents). Others describe the region itself.
VM_DENYWRITE - region maps a file that can't be opened for writing
VM_EXEC - pages can be executed
VM_EXECUTABLE - pages contain executable code
VM_GROWSDOWN - region can expand toward lower addresses
VM_GROWSUP - region can expand toward higher addresses
VM_IO - region maps the I/O address space of a device
VM_LOCKED - pages are locked and can't be swapped out
VM_MAYEXEC - VM_EXEC flag may be set
VM_MAYREAD - VM_READ flag may be set
VM_MAYSHARE - VM_SHARED flag may be set
VM_MAYWRITE - VM_WRITE flag may be set
VM_READ - pages can be read
VM_SHARED - pages can be shared by several processes
VM_SHM - pages are used for IPC's shared memory
VM_WRITE - pages can be written
The initial values of the page table flags (which must be the same for all
pages in the region) are stored in the vm_page_prot field of vm_area_struct.
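How the VM_MAY* flags gate the access flags can be sketched with a small check.
The TOY_* values are assumptions for this example, chosen so that each MAY flag
sits four bits above the flag it guards. The real values should be taken from
include/linux/mm.h, and toy_set_access() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values (assumed, see include/linux/mm.h). */
#define TOY_VM_READ      0x0001u
#define TOY_VM_WRITE     0x0002u
#define TOY_VM_EXEC      0x0004u
#define TOY_VM_SHARED    0x0008u
#define TOY_VM_MAYREAD   0x0010u
#define TOY_VM_MAYWRITE  0x0020u
#define TOY_VM_MAYEXEC   0x0040u
#define TOY_VM_MAYSHARE  0x0080u

/* 1 if every access flag that is set is permitted by its MAY flag. */
static inline int toy_flags_allowed(unsigned int vm_flags)
{
    unsigned int access = vm_flags & 0x000fu;
    unsigned int may    = (vm_flags >> 4) & 0x000fu;

    return (access & ~may) == 0;
}

/* Turn on the wanted access flags; returns 0, or -1 if not permitted. */
static inline int toy_set_access(unsigned int *vm_flags, unsigned int wanted)
{
    unsigned int next;

    assert(vm_flags != NULL);
    next = *vm_flags | (wanted & 0x000fu);
    if (!toy_flags_allowed(next))
        return -1;
    *vm_flags = next;
    return 0;
}
```

A region with TOY_VM_READ | TOY_VM_MAYREAD | TOY_VM_MAYEXEC can gain
TOY_VM_EXEC, but a request for TOY_VM_WRITE is refused and the flags stay
unchanged.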
Logical address - this type of address embodies the segmentation architecture.
Each address consists of a segment and an offset that defines the distance from
the start of the segment.
Linear address - a single 32bit unsigned int that can be used to address up to
4GB.
Physical address - used to address memory cells. Also a 32bit unsigned int.
                  ______________                     ________
 Logical addr >>>| Segmentation |>>> Linear addr >>>| Paging |>>> Physical addr
                 |______________|                   |________|

Pic 1. Logical address translation
2. Segmentation in hardware

Segmentation registers.
A logical address consists of 2 parts: a segment identifier and an offset. The
segment identifier is a 16bit field called the segment selector. The offset is
a 32bit field. There are six segment registers to hold segment selectors: cs,
ss, ds, es, fs, gs. Three of them have specific purposes:
cs - code segment register
ss - stack segment register
ds - data segment register, points to the segment where static and external
data are stored
Each segment is represented by an 8B segment descriptor. Segment descriptors
are stored in the Global Descriptor Table (GDT) or in a Local Descriptor Table
(LDT). The address of the GDT is contained in the gdtr processor register and
that of the LDT in the ldtr register. Each segment descriptor consists of the
following fields:
* 32bit base field - linear addr of the 1st byte of the segment
* G - granularity flag. If off => segment size is expressed in bytes, otherwise
  in multiples of 4096 B
* 20bit limit field - segment length. If G=0 the size is 1B-1MB, otherwise
  (G=1) from 4KB to 4GB
* S - system flag.
* 4bit TYPE

Bit positions of the fields in the descriptor:
bits 63-56 - BASE (24-31)
bit 55 - G
bit 54 - B
bits 31-16 - BASE (0-15)
bits 15-0 - LIMIT (0-15)
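Putting the base, limit and G fields together, a descriptor can be decoded and
used to turn a logical address into a linear one. This is a sketch under
assumptions: struct seg_info, decode_descriptor() and logical_to_linear() are
hypothetical names, the bit positions are those of the standard x86 descriptor
layout, and the limit is treated as the last valid offset inside the segment.

```c
#include <assert.h>
#include <stdint.h>

struct seg_info {
    uint32_t base;          /* linear address of the 1st byte            */
    uint32_t last_offset;   /* highest valid offset inside the segment   */
    int g;                  /* granularity flag                          */
};

/* Decode base, limit and G from an 8-byte segment descriptor. */
static inline struct seg_info decode_descriptor(uint64_t d)
{
    struct seg_info s;
    uint32_t limit;

    limit  = (uint32_t)(d & 0xffffu);                  /* limit bits 0-15  */
    limit |= (uint32_t)((d >> 48) & 0xfu) << 16;       /* limit bits 16-19 */
    s.base  = (uint32_t)((d >> 16) & 0xffffffu);       /* base bits 0-23   */
    s.base |= (uint32_t)((d >> 56) & 0xffu) << 24;     /* base bits 24-31  */
    s.g = (int)((d >> 55) & 1u);
    /* G=1: the limit counts 4096-byte units instead of bytes. */
    s.last_offset = s.g ? ((limit << 12) | 0xfffu) : limit;
    return s;
}

/* linear = base + offset; returns -1 if the offset is outside the segment. */
static inline int logical_to_linear(const struct seg_info *s,
                                    uint32_t offset, uint32_t *linear)
{
    assert(s != 0 && linear != 0);
    if (offset > s->last_offset)
        return -1;
    *linear = s->base + offset;
    return 0;
}
```

A descriptor with base 0, a limit of 0xfffff and G=1 (for example
0x00cf9a000000ffff) covers the whole 4GB, so every offset maps to the linear
address with the same value.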