   1 6.828 Fall 2011 Lab 3: User Environments
   2
   3 Handed out Wednesday, September 28, 2011
   4 Part A due Thursday, October 6, 2011
   5 Part B due Thursday, October 13, 2011
   6
   7 Introduction
   8
   9 In this lab you will implement the basic kernel facilities required to get a
  10 protected user-mode environment (i.e., "process") running. You will enhance the
  11 JOS kernel to set up the data structures to keep track of user environments,
  12 create a single user environment, load a program image into it, and start it
  13 running. You will also make the JOS kernel capable of handling any system calls
  14 the user environment makes and handling any other exceptions it causes.
  15
  16 Note: In this lab, the terms environment and process are interchangeable - they
  17 have roughly the same meaning. We introduce the term "environment" instead of
  18 the traditional term "process" in order to stress the point that JOS
  19 environments do not provide the same semantics as UNIX processes, even though
  20 they are roughly comparable.
  21
  22 Getting Started
  23
  24 Use Git to commit your Lab 2 source, fetch the latest version of the course
  25 repository, and then create a local branch called lab3 based on our lab3
  26 branch, origin/lab3:
  27
  28 athena% cd ~/6.828/lab
  29 athena% add git
  30 athena% git commit -am 'my solution to lab2'
  31 Created commit 734fab7: my solution to lab2
  32  4 files changed, 42 insertions(+), 9 deletions(-)
  33 athena% git pull
  34 Already up-to-date.
  35 athena% git checkout -b lab3 origin/lab3
  36 Branch lab3 set up to track remote branch refs/remotes/origin/lab3.
  37 Switched to a new branch "lab3"
  38 athena% git merge lab2
  39 Merge made by recursive.
  40  kern/pmap.c |   42 +++++++++++++++++++
  41  1 files changed, 42 insertions(+), 0 deletions(-)
  42 athena%
  43
  44 Lab 3 contains a number of new source files, which you should browse:
  45
  46 inc/  env.h       Public definitions for user-mode environments
  47       trap.h      Public definitions for trap handling
  48       syscall.h   Public definitions for system calls from user environments to
  49                   the kernel
  50       lib.h       Public definitions for the user-mode support library
  51 kern/ env.h       Kernel-private definitions for user-mode environments
  53       env.c       Kernel code implementing user-mode environments
  54       trap.h      Kernel-private trap handling definitions
  55       trap.c      Trap handling code
  56       trapentry.S Assembly-language trap handler entry-points
  57       syscall.h   Kernel-private definitions for system call handling
  58       syscall.c   System call implementation code
  59 lib/  Makefrag    Makefile fragment to build user-mode library,
  60                   obj/lib/libuser.a
  61       entry.S     Assembly-language entry-point for user environments
  62       libmain.c   User-mode library setup code called from entry.S
  63       syscall.c   User-mode system call stub functions
  64       console.c   User-mode implementations of putchar and getchar, providing
  65                   console I/O
  66       exit.c      User-mode implementation of exit
  67       panic.c     User-mode implementation of panic
  68 user/ *           Various test programs to check kernel lab 3 code
  70
  71 In addition, a number of the source files we handed out for lab2 are modified
  72 in lab3. To see the differences, you can type:
  73
  74 $ git diff lab2 | more
  75
  76 You may also want to take another look at the lab tools guide, as it includes
  77 information on debugging user code that becomes relevant in this lab.
  78
  79 Lab Requirements
  80
  81 This lab is divided into two parts, A and B. Part A is due a week after this
  82 lab was assigned; you should run make handin before the Part A deadline,
  83 even though your code may not yet pass all of the grade script tests. (If it
  84 does, great!) You only need to have all the grade script tests passing by the
  85 Part B deadline at the end of the second week.
  86
  87 As in lab 2, you will need to do all of the regular exercises described in the
  88 lab and at least one challenge problem (for the entire lab, not for each part).
  89 Write up brief answers to the questions posed in the lab and a one or two
  90 paragraph description of what you did to solve your chosen challenge problem in
  91 a file called answers-lab3.txt in the top level of your lab directory. (If you
  92 implement more than one challenge problem, you only need to describe one of
  93 them in the write-up.)
  94
  95 Inline Assembly
  96
  97 In this lab you may find GCC's inline assembly language feature useful,
  98 although it is also possible to complete the lab without using it. At the very
  99 least, you will need to be able to understand the fragments of inline assembly
 100 language ("asm" statements) that already exist in the source code we gave you.
 101 You can find several sources of information on GCC inline assembly language on
 102 the class reference materials page.
 103
 104 Part A: User Environments and Exception Handling
 105
 106 The new include file inc/env.h contains basic definitions for user environments
 107 in JOS. Read it now. The kernel uses the Env data structure to keep track of
 108 each user environment. In this lab you will initially create just one
 109 environment, but you will need to design the JOS kernel to support multiple
 110 environments; lab 4 will take advantage of this feature by allowing a user
 111 environment to fork other environments.
 112
 113 As you can see in kern/env.c, the kernel maintains three main global variables
 114 pertaining to environments:
 115
 116 struct Env *envs = NULL;                // All environments
 117 struct Env *curenv = NULL;              // The current env
 118 static struct Env *env_free_list;       // Free environment list
 119
 120 Once JOS gets up and running, the envs pointer points to an array of Env
 121 structures representing all the environments in the system. In our design, the
 122 JOS kernel will support a maximum of NENV simultaneously active environments,
 123 although there will typically be far fewer running environments at any given
 124 time. (NENV is a constant #define'd in inc/env.h.) Once it is allocated, the
 125 envs array will contain a single instance of the Env data structure for each of
 126 the NENV possible environments.
 127
 128 The JOS kernel keeps all of the inactive Env structures on the env_free_list.
 129 This design allows easy allocation and deallocation of environments, as they
 130 merely have to be added to or removed from the free list.
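
To make the free-list idea concrete, here is a small stand-alone sketch. It is
not the JOS code: the names ToyEnv, toy_init, toy_alloc, and toy_free, and the
table size TOY_NENV, are invented for this illustration. It only shows how a
fixed array of slots can be threaded onto a singly linked list so that
allocation and deallocation are a pop and a push.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NENV 4              /* hypothetical table size; JOS uses NENV */

struct ToyEnv {
	struct ToyEnv *link;    /* next free slot, playing the role of env_link */
	int in_use;             /* 0 = free, 1 = allocated */
};

static struct ToyEnv toy_envs[TOY_NENV];
static struct ToyEnv *toy_free_list;

/* Put every slot on the free list so that slot 0 ends up at the head. */
void toy_init(void)
{
	toy_free_list = NULL;
	for (int i = TOY_NENV - 1; i >= 0; i--) {
		toy_envs[i].in_use = 0;
		toy_envs[i].link = toy_free_list;
		toy_free_list = &toy_envs[i];
	}
}

/* Allocation pops the head of the list; returns NULL when no slot is free. */
struct ToyEnv *toy_alloc(void)
{
	struct ToyEnv *e = toy_free_list;
	if (e == NULL)
		return NULL;
	toy_free_list = e->link;
	e->link = NULL;
	e->in_use = 1;
	return e;
}

/* Deallocation pushes the slot back onto the head of the list. */
void toy_free(struct ToyEnv *e)
{
	assert(e != NULL && e->in_use);
	e->in_use = 0;
	e->link = toy_free_list;
	toy_free_list = e;
}
```

Walking the array backwards in toy_init is one way to make the first
allocation return slot 0; a freed slot is reused before any untouched one
because it goes back on the head of the list.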
 131
 132 The kernel uses the curenv variable to keep track of the currently executing
 133 environment at any given time. During boot up, before the first environment is
 134 run, curenv is initially set to NULL.
 135
 136 Environment State
 137
 138 The Env structure is defined in inc/env.h as follows (although more fields will
 139 be added in future labs):
 140
 141 struct Env {
 142         struct Trapframe env_tf;        // Saved registers
 143         struct Env *env_link;           // Next free Env
 144         envid_t env_id;                 // Unique environment identifier
 145         envid_t env_parent_id;          // env_id of this env's parent
 146         enum EnvType env_type;          // Indicates special system environments
 147         unsigned env_status;            // Status of the environment
 148         uint32_t env_runs;              // Number of times environment has run
 149
 150         // Address space
 151         pde_t *env_pgdir;               // Kernel virtual address of page dir
 152 };
 153
 154 Here's what the Env fields are for:
 155
 156 env_tf:
 157     This structure, defined in inc/trap.h, holds the saved register values for
 158     the environment while that environment is not running: i.e., when the
 159     kernel or a different environment is running. The kernel saves these when
 160     switching from user to kernel mode, so that the environment can later be
 161     resumed where it left off.
 162 env_link:
 163     This is a link to the next Env on the env_free_list. env_free_list points
 164     to the first free environment on the list.
 165 env_id:
 166     The kernel stores here a value that uniquely identifies the environment
 167     currently using this Env structure (i.e., using this particular slot in the
 168     envs array). After a user environment terminates, the kernel may
 169     re-allocate the same Env structure to a different environment - but the new
 170     environment will have a different env_id from the old one even though the
 171     new environment is re-using the same slot in the envs array.
 172 env_parent_id:
 173     The kernel stores here the env_id of the environment that created this
 174     environment. In this way the environments can form a "family tree," which
 175     will be useful for making security decisions about which environments are
 176     allowed to do what to whom.
 177 env_type:
 178     This is used to distinguish special environments. For most environments, it
 179     will be ENV_TYPE_USER. The idle environment is ENV_TYPE_IDLE and we'll
 180     introduce a few more special types for special system service environments
 181     in later labs.
 182 env_status:
 183     This variable holds one of the following values:
 184
 185     ENV_FREE:
 186         Indicates that the Env structure is inactive, and therefore on the
 187         env_free_list.
 188     ENV_RUNNABLE:
 189         Indicates that the Env structure represents an environment that is
 190         waiting to run on the processor.
 191     ENV_RUNNING:
 192         Indicates that the Env structure represents the currently running
 193         environment.
 194     ENV_NOT_RUNNABLE:
 195         Indicates that the Env structure represents a currently active
 196         environment, but it is not currently ready to run: for example, because
 197         it is waiting for an interprocess communication (IPC) from another
 198         environment.
 199
 200 env_pgdir:
 201     This variable holds the kernel virtual address of this environment's page
 202     directory.
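
One way an env_id can stay unique while the slot is reused is to split the
identifier into a slot index (low bits) and a generation counter (high bits).
The sketch below illustrates that scheme under assumed constants; the names
toy_envx and toy_next_id and the values of TOY_LOG2NENV and TOY_GENSHIFT are
made up here, so check inc/env.h and kern/env.c for what JOS actually does.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants for this sketch; see inc/env.h for the real values. */
#define TOY_LOG2NENV 10
#define TOY_NENV     (1 << TOY_LOG2NENV)
#define TOY_GENSHIFT 12         /* must be >= TOY_LOG2NENV */

/* Slot index encoded in the low bits of an id. */
static inline int32_t toy_envx(int32_t id)
{
	return id & (TOY_NENV - 1);
}

/*
 * Compute a fresh id for the slot at index `slot`, given the id the slot
 * held previously (0 if it was never used).  The upper bits act as a
 * generation counter, so a reused slot does not hand out the id it had
 * on its previous use.
 */
int32_t toy_next_id(int32_t old_id, int32_t slot)
{
	assert(old_id >= 0 && slot >= 0 && slot < TOY_NENV);
	uint32_t gen = ((uint32_t)old_id + (1u << TOY_GENSHIFT))
	               & ~(uint32_t)(TOY_NENV - 1);
	if (gen == 0 || gen > (uint32_t)INT32_MAX)   /* keep ids positive */
		gen = 1u << TOY_GENSHIFT;
	return (int32_t)(gen | (uint32_t)slot);
}
```

With this encoding, the kernel can find the slot for an id with a mask, and
can detect a stale id by comparing it with the id currently stored in that
slot.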
 203
 204 Like a Unix process, a JOS environment couples the concepts of "thread" and
 205 "address space". The thread is defined primarily by the saved registers (the
 206 env_tf field), and the address space is defined by the page directory and page
 207 tables pointed to by env_pgdir. To run an environment, the kernel must set up
 208 the CPU with both the saved registers and the appropriate address space.
 209
 210 Our struct Env is analogous to struct proc in xv6. Both structures hold the
 211 environment's (i.e., process's) user-mode register state in a Trapframe
 212 structure. In JOS, individual environments do not have their own kernel stacks
 213 as processes do in xv6. There can be only one JOS environment active in the kernel
 214 at a time, so JOS needs only a single kernel stack.
 215
 216 Allocating the Environments Array
 217
 218 In lab 2, you allocated memory in mem_init() for the pages[] array, which is a
 219 table the kernel uses to keep track of which pages are free and which are not.
 220 You will now need to modify mem_init() further to allocate a similar array of
 221 Env structures, called envs.
 222
 223 Exercise 1. Modify mem_init() in kern/pmap.c to allocate and map the envs
 224 array. This array consists of exactly NENV instances of the Env structure
 225 allocated much like how you allocated the pages array. Also like the pages
 226 array, the memory backing envs should also be mapped user read-only at UENVS
 227 (defined in inc/memlayout.h) so user processes can read from this array.
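
Because mappings are made in whole pages, it can help to work out how many
bytes the array occupies once rounded up to a page boundary. The helper names
below (toy_roundup, toy_array_bytes) are invented for this sketch and the
record sizes in the usage note are arbitrary; in the kernel you would use the
real sizeof(struct Env) and whatever rounding macro the JOS headers provide.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PGSIZE 4096u        /* x86 page size in bytes */

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t toy_roundup(size_t n, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (n + align - 1) & ~(align - 1);
}

/* Page-aligned bytes needed to hold `count` records of `size` bytes each. */
size_t toy_array_bytes(size_t count, size_t size)
{
	return toy_roundup(count * size, TOY_PGSIZE);
}
```

For example, 41 records of 100 bytes need 4100 bytes, which rounds up to two
pages (8192 bytes).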
 228
 229 You should run your code and make sure check_kern_pgdir() succeeds.
 230
 231 Creating and Running Environments
 232
 233 You will now write the code in kern/env.c necessary to run a user environment.
 234 Because we do not yet have a filesystem, we will set up the kernel to load a
 235 static binary image that is embedded within the kernel itself. JOS embeds this
 236 binary in the kernel as an ELF executable image.
 237
 238 The Lab 3 GNUmakefile generates a number of binary images in the obj/user/
 239 directory. If you look at kern/Makefrag, you will notice some magic that
 240 "links" these binaries directly into the kernel executable as if they were .o
 241 files. The -b binary option on the linker command line causes these files to be
 242 linked in as "raw" uninterpreted binary files rather than as regular .o files
 243 produced by the compiler. (As far as the linker is concerned, these files do
 244 not have to be ELF images at all - they could be anything, such as text files
 245 or pictures!) If you look at obj/kern/kernel.sym after building the kernel, you
 246 will notice that the linker has "magically" produced a number of funny symbols
 247 with obscure names like _binary_obj_user_hello_start,
 248 _binary_obj_user_hello_end, and _binary_obj_user_hello_size. The linker
 249 generates these symbol names by mangling the file names of the binary files;
 250 the symbols provide the regular kernel code with a way to reference the
 251 embedded binary files.
 252
 253 In i386_init() in kern/init.c you'll see code to run one of these binary images
 254 in an environment. However, the critical functions to set up user environments
 255 are not complete; you will need to fill them in.
 256
 257 Exercise 2. In the file env.c, finish coding the following functions:
 258
 259 env_init()
 260     Initialize all of the Env structures in the envs array and add them to the
 261     env_free_list. Also calls env_init_percpu, which configures the
 262     segmentation hardware with separate segments for privilege level 0 (kernel)
 263     and privilege level 3 (user).
 264 env_setup_vm()
 265     Allocate a page directory for a new environment and initialize the kernel
 266     portion of the new environment's address space.
 267 region_alloc()
 268     Allocates and maps physical memory for an environment.
 269 load_icode()
 270     You will need to parse an ELF binary image, much like the boot loader
 271     already does, and load its contents into the user address space of a new
 272     environment.
 273 env_create()
 274     Allocate an environment with env_alloc and call load_icode to load an ELF
 275     binary into it.
 276 env_run()
 277     Start a given environment running in user mode.
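
As a warm-up for load_icode(), the following stand-alone sketch walks the
program headers of an ELF32 image held in memory. The struct and function
names (ToyElf, ToyProghdr, toy_loadable_bytes) are invented for this example;
JOS has its own definitions in inc/elf.h. The sketch assumes a little-endian
host and only adds up segment sizes instead of loading anything.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_ELF_MAGIC 0x464C457FU   /* "\x7F" "ELF" read as a little-endian word */
#define TOY_PT_LOAD   1             /* program header type: loadable segment */

/* ELF32 file header (standard field layout, hypothetical names). */
struct ToyElf {
	uint32_t magic;
	uint8_t  ident[12];
	uint16_t type, machine;
	uint32_t version, entry, phoff, shoff, flags;
	uint16_t ehsize, phentsize, phnum, shentsize, shnum, shstrndx;
};

/* ELF32 program header. */
struct ToyProghdr {
	uint32_t type, offset, va, pa, filesz, memsz, flags, align;
};

/*
 * Walk the program headers of an in-memory image and add up the memory
 * (memsz) required by its loadable segments.  Returns -1 if the image
 * does not start with the ELF magic number.  A real loader would copy
 * filesz bytes from image + offset to va and zero the remaining
 * memsz - filesz bytes, rather than just summing sizes.
 */
int64_t toy_loadable_bytes(const uint8_t *image)
{
	struct ToyElf eh;
	memcpy(&eh, image, sizeof(eh));
	if (eh.magic != TOY_ELF_MAGIC)
		return -1;

	int64_t total = 0;
	for (uint16_t i = 0; i < eh.phnum; i++) {
		struct ToyProghdr ph;
		memcpy(&ph, image + eh.phoff + (size_t)i * sizeof(ph), sizeof(ph));
		if (ph.type != TOY_PT_LOAD)
			continue;               /* skip non-loadable segments */
		assert(ph.filesz <= ph.memsz);
		total += ph.memsz;
	}
	return total;
}
```

The gap between filesz and memsz is where a program's zero-initialized data
lives, which is why a loader has to clear it explicitly.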
 278
 279 As you write these functions, you might find the new cprintf verb %e useful --
 280 it prints a description corresponding to an error code. For example,
 281
 282         r = -E_NO_MEM;
 283         panic("env_alloc: %e", r);
 284
 285 will panic with the message "env_alloc: out of memory".
 286
 287 Below is a call graph of the code up to the point where the user code is
 288 invoked. Make sure you understand the purpose of each step.
 289
 290   • start (kern/entry.S)
 291   • i386_init (kern/init.c)
 292       □ cons_init
 293       □ mem_init
 294       □ env_init
 295       □ trap_init (still incomplete at this point)
 296       □ env_create
 297       □ env_run
 298           ☆ env_pop_tf
 299
 300 Once you are done you should compile your kernel and run it under QEMU. If all
 301 goes well, your system should enter user space and execute the hello binary
 302 until it makes a system call with the int instruction. At that point there will
 303 be trouble, since JOS has not set up the hardware to allow any kind of
 304 transition from user space into the kernel. When the CPU discovers that it is
 305 not set up to handle this system call interrupt, it will generate a general
 306 protection exception, find that it can't handle that, generate a double fault
 307 exception, find that it can't handle that either, and finally give up with
 308 what's known as a "triple fault". Usually, you would then see the CPU reset and
 309 the system reboot. While this is important for legacy applications (see this
 310 blog post for an explanation of why), it's a pain for kernel development, so
 311 with the 6.828 patched QEMU you'll instead see a register dump and a "Triple
 312 fault." message.
 313
 314 We'll address this problem shortly, but for now we can use the debugger to
 315 check that we're entering user mode. Use make qemu-gdb and set a GDB breakpoint
 316 at env_pop_tf, which should be the last function you hit before actually
 317 entering user mode. Single step through this function using si; the processor
 318 should enter user mode after the iret instruction. You should then see the
 319 first instruction in the user environment's executable, which is the cmpl
 320 instruction at the label start in lib/entry.S. Now use b *0x... to set a
 321 breakpoint at the int $0x30 in sys_cputs() in hello (see obj/user/hello.asm for
 322 the user-space address). This int is the system call to display a character to
 323 the console. If you cannot execute as far as the int, then something is wrong
 324 with your address space setup or program loading code; go back and fix it
 325 before continuing.
 326
 327 Handling Interrupts and Exceptions
 328
 329 At this point, the first int $0x30 system call instruction in user space is a
 330 dead end: once the processor gets into user mode, there is no way to get back
 331 out. You will now need to implement basic exception and system call handling,
 332 so that it is possible for the kernel to recover control of the processor from
 333 user-mode code. The first thing you should do is thoroughly familiarize
 334 yourself with the x86 interrupt and exception mechanism.
 335
 336 Exercise 3. Read Chapter 9, Exceptions and Interrupts in the 80386 Programmer's
 337 Manual (or Chapter 5 of the IA-32 Developer's Manual), if you haven't already.
 338
 339 In this lab we generally follow Intel's terminology for interrupts, exceptions,
 340 and the like. However, terms such as exception, trap, interrupt, fault and
 341 abort have no standard meaning across architectures or operating systems, and
 342 are often used without regard to the subtle distinctions between them on a
 343 particular architecture such as the x86. When you see these terms outside of
 344 this lab, the meanings might be slightly different.
 345
 346 Basics of Protected Control Transfer
 347
 348 Exceptions and interrupts are both "protected control transfers," which cause
 349 the processor to switch from user to kernel mode (CPL=0) without giving the
 350 user-mode code any opportunity to interfere with the functioning of the kernel
 351 or other environments. In Intel's terminology, an interrupt is a protected
 352 control transfer that is caused by an asynchronous event usually external to
 353 the processor, such as notification of external device I/O activity. An
 354 exception, in contrast, is a protected control transfer caused synchronously by
 355 the currently running code, for example due to a divide by zero or an invalid
 356 memory access.
 357
 358 In order to ensure that these protected control transfers are actually
 359 protected, the processor's interrupt/exception mechanism is designed so that
 360 the code currently running when the interrupt or exception occurs does not get
 361 to choose arbitrarily where the kernel is entered or how. Instead, the
 362 processor ensures that the kernel can be entered only under carefully
 363 controlled conditions. On the x86, two mechanisms work together to provide this
 364 protection:
 365
 366  1. The Interrupt Descriptor Table. The processor ensures that interrupts and
 367     exceptions can only cause the kernel to be entered at a few specific,
 368     well-defined entry-points determined by the kernel itself, and not by the
 369     code running when the interrupt or exception is taken.
 370
 371     The x86 allows up to 256 different interrupt or exception entry points into
 372     the kernel, each with a different interrupt vector. A vector is a number
 373     between 0 and 255. An interrupt's vector is determined by the source of the
 374     interrupt: different devices, error conditions, and application requests to
 375     the kernel generate interrupts with different vectors. The CPU uses the
 376     vector as an index into the processor's interrupt descriptor table (IDT),
 377     which the kernel sets up in kernel-private memory, much like the GDT. From
 378     the appropriate entry in this table the processor loads:
 379
 380       □ the value to load into the instruction pointer (EIP) register, pointing
 381         to the kernel code designated to handle that type of exception.
 382       □ the value to load into the code segment (CS) register, which includes
 383         in bits 0-1 the privilege level at which the exception handler is to
 384         run. (In JOS, all exceptions are handled in kernel mode, privilege
 385         level 0.)
 386  2. The Task State Segment. The processor needs a place to save the old
 387     processor state before the interrupt or exception occurred, such as the
 388     original values of EIP and CS before the processor invoked the exception
 389     handler, so that the exception handler can later restore that old state and
 390     resume the interrupted code from where it left off. But this save area for
 391     the old processor state must in turn be protected from unprivileged
 392     user-mode code; otherwise buggy or malicious user code could compromise the
 393     kernel.
 394
 395     For this reason, when an x86 processor takes an interrupt or trap that
 396     causes a privilege level change from user to kernel mode, it also switches
 397     to a stack in the kernel's memory. A structure called the task state
 398     segment (TSS) specifies the segment selector and address where this stack
 399     lives. The processor pushes (on this new stack) SS, ESP, EFLAGS, CS, EIP,
 400     and an optional error code. Then it loads the CS and EIP from the interrupt
 401     descriptor, and sets the ESP and SS to refer to the new stack.
 402
 403     Although the TSS is large and can potentially serve a variety of purposes,
 404     JOS only uses it to define the kernel stack that the processor should
 405     switch to when it transfers from user to kernel mode. Since "kernel mode"
 406     in JOS is privilege level 0 on the x86, the processor uses the ESP0 and SS0
 407     fields of the TSS to define the kernel stack when entering kernel mode. JOS
 408     doesn't use any other TSS fields.
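
To see where the EIP and CS values live inside an IDT entry, the sketch below
packs and unpacks a 32-bit x86 gate descriptor by hand. The function names
are invented for this illustration; in JOS you would use the SETGATE macro
rather than code like this. The selector value 0x08 in the usage note is only
an example of a kernel code selector.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_GATE_INT32  0xEu    /* 32-bit interrupt gate type */
#define TOY_GATE_TRAP32 0xFu    /* 32-bit trap gate type */

/*
 * Pack a 64-bit x86 interrupt/trap gate descriptor.
 *   offset:   handler entry point (loaded into EIP)
 *   selector: code segment selector (loaded into CS)
 *   type:     gate type
 *   dpl:      software may invoke the gate with int only if CPL <= dpl
 */
uint64_t toy_make_gate(uint32_t offset, uint16_t selector,
                       unsigned type, unsigned dpl)
{
	assert(type <= 0xF && dpl <= 3);
	uint64_t d = 0;
	d |= (uint64_t)(offset & 0xFFFF);          /* bits 0-15: offset low */
	d |= (uint64_t)selector << 16;             /* bits 16-31: selector */
	d |= (uint64_t)type << 40;                 /* bits 40-43: gate type */
	d |= (uint64_t)dpl << 45;                  /* bits 45-46: DPL */
	d |= (uint64_t)1 << 47;                    /* bit 47: present */
	d |= (uint64_t)(offset >> 16) << 48;       /* bits 48-63: offset high */
	return d;
}

/* Recover the handler address the processor would load into EIP. */
uint32_t toy_gate_offset(uint64_t d)
{
	return (uint32_t)(d & 0xFFFF) | ((uint32_t)(d >> 48) << 16);
}

/* Recover the selector; its low two bits are the requested privilege level. */
uint16_t toy_gate_selector(uint64_t d)
{
	return (uint16_t)(d >> 16);
}
```

For instance, toy_make_gate(0xF0101234, 0x08, TOY_GATE_INT32, 0) yields
0xF0108E0000081234: the handler address is split across the two ends of the
descriptor, and the selector's low two bits are zero, so the handler runs at
privilege level 0.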
 409
 410 Types of Exceptions and Interrupts
 411
 412 All of the synchronous exceptions that the x86 processor can generate
 413 internally use interrupt vectors between 0 and 31, and therefore map to IDT
 414 entries 0-31. For example, a page fault always causes an exception through
 415 vector 14. Interrupt vectors greater than 31 are only used by software
 416 interrupts, which can be generated by the int instruction, or asynchronous
 417 hardware interrupts, caused by external devices when they need attention.
 418
 419 In this section we will extend JOS to handle the internally generated x86
 420 exceptions in vectors 0-31. In the next section we will make JOS handle
 421 software interrupt vector 48 (0x30), which JOS (fairly arbitrarily) uses as its
 422 system call interrupt vector. In Lab 4 we will extend JOS to handle externally
 423 generated hardware interrupts such as the clock interrupt.
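
The vector ranges used in this lab can be summarized in a few lines of C. The
enum and function names below are invented for this sketch; the ranges
themselves come from the lab text (0-31 for processor exceptions, 32-47 for
device IRQs, and 48 for the JOS system call).

```c
#include <assert.h>

enum ToyVectorKind {
	TOY_EXCEPTION,      /* vectors 0-31: processor-generated exceptions */
	TOY_DEVICE_IRQ,     /* vectors 32-47: external device interrupts */
	TOY_SYSCALL,        /* vector 48 (0x30): JOS system call */
	TOY_OTHER           /* anything else: unused in this lab */
};

/* Classify an interrupt vector according to how this lab uses it. */
enum ToyVectorKind toy_classify(unsigned vector)
{
	assert(vector <= 255);
	if (vector <= 31)
		return TOY_EXCEPTION;
	if (vector <= 47)
		return TOY_DEVICE_IRQ;
	if (vector == 0x30)
		return TOY_SYSCALL;
	return TOY_OTHER;
}
```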
 424
 425 An Example
 426
 427 Let's put these pieces together and trace through an example. Let's say the
 428 processor is executing code in a user environment and encounters a divide
 429 instruction that attempts to divide by zero.
 430
 431  1. The processor switches to the stack defined by the SS0 and ESP0 fields of
 432     the TSS, which in JOS will hold the values GD_KD and KSTACKTOP,
 433     respectively.
 434  2. The processor pushes the exception parameters on the kernel stack, starting
 435     at address KSTACKTOP:
 436
 437                          +--------------------+ KSTACKTOP
 438                          | 0x00000 | old SS   |     " - 4
 439                          |      old ESP       |     " - 8
 440                          |     old EFLAGS     |     " - 12
 441                          | 0x00000 | old CS   |     " - 16
 442                          |      old EIP       |     " - 20 <---- ESP
 443                          +--------------------+
 444
 445
 446  3. Because we're handling a divide error, which is interrupt vector 0 on the
 447     x86, the processor reads IDT entry 0 and sets CS:EIP to point to the
 448     handler function described by the entry.
 449  4. The handler function takes control and handles the exception, for example
 450     by terminating the user environment.
 451
 452 For certain types of x86 exceptions, in addition to the "standard" five words
 453 above, the processor pushes onto the stack another word containing an error
 454 code. The page fault exception, number 14, is an important example. See the
 455 80386 manual to determine for which exception numbers the processor pushes an
 456 error code, and what the error code means in that case. When the processor
 457 pushes an error code, the stack would look as follows at the beginning of the
 458 exception handler when coming in from user mode:
 459
 460                      +--------------------+ KSTACKTOP
 461                      | 0x00000 | old SS   |     " - 4
 462                      |      old ESP       |     " - 8
 463                      |     old EFLAGS     |     " - 12
 464                      | 0x00000 | old CS   |     " - 16
 465                      |      old EIP       |     " - 20
 466                      |     error code     |     " - 24 <---- ESP
 467                      +--------------------+
 468
 469
 470 Nested Exceptions and Interrupts
 471
 472 The processor can take exceptions and interrupts both from kernel and user
 473 mode. It is only when entering the kernel from user mode, however, that the x86
 474 processor automatically switches stacks before pushing its old register state
 475 onto the stack and invoking the appropriate exception handler through the IDT.
 476 If the processor is already in kernel mode when the interrupt or exception
 477 occurs (the low 2 bits of the CS register are already zero), then the CPU just
 478 pushes more values on the same kernel stack. In this way, the kernel can
 479 gracefully handle nested exceptions caused by code within the kernel itself.
 480 This capability is an important tool in implementing protection, as we will see
 481 later in the section on system calls.
 482
 483 If the processor is already in kernel mode and takes a nested exception, since
 484 it does not need to switch stacks, it does not save the old SS or ESP
 485 registers. For exception types that do not push an error code, the kernel stack
 486 therefore looks like the following on entry to the exception handler:
 487
 488                      +--------------------+ <---- old ESP
 489                      |     old EFLAGS     |     " - 4
 490                      | 0x00000 | old CS   |     " - 8
 491                      |      old EIP       |     " - 12
 492                      +--------------------+
 493
 494 For exception types that push an error code, the processor pushes the error
 495 code immediately after the old EIP, as before.
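
The four stack pictures (from user or kernel mode, with or without an error
code) differ only in how many words the processor pushes and on which stack.
The sketch below computes the value of ESP on entry to the handler for each
case. The function names are invented, and the addresses in the usage note
are example values chosen for illustration, not values read from JOS.

```c
#include <assert.h>
#include <stdint.h>

/* The low two bits of a code segment selector hold the privilege level. */
static int toy_cpl(uint16_t cs)
{
	return cs & 3;
}

/*
 * Value of ESP on entry to an exception handler.
 *   old_cs:   CS of the interrupted code
 *   old_esp:  ESP of the interrupted code
 *   esp0:     kernel stack top taken from the TSS
 *   has_err:  nonzero if this exception type pushes an error code
 */
uint32_t toy_entry_esp(uint16_t old_cs, uint32_t old_esp,
                       uint32_t esp0, int has_err)
{
	uint32_t esp;
	if (toy_cpl(old_cs) != 0) {
		esp = esp0;      /* privilege change: switch to the kernel stack */
		esp -= 8;        /* old SS and old ESP */
	} else {
		esp = old_esp;   /* already in the kernel: stay on the same stack */
	}
	esp -= 12;               /* old EFLAGS, old CS, old EIP */
	if (has_err)
		esp -= 4;        /* error code */
	return esp;
}
```

With an example kernel stack top of 0xF0000000, an exception from user mode
leaves ESP at 0xEFFFFFEC (top - 20), or 0xEFFFFFE8 (top - 24) with an error
code, matching the diagrams above. From kernel mode only 12 or 16 bytes are
pushed below the old ESP.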
 496
 497 There is one important caveat to the processor's nested exception capability.
 498 If the processor takes an exception while already in kernel mode, and cannot
 499 push its old state onto the kernel stack for any reason such as lack of stack
 500 space, then there is nothing the processor can do to recover, so it simply
 501 resets itself. Needless to say, the kernel should be designed so that this
 502 can't happen.
 503
 504 Setting Up the IDT
 505
 506 You should now have the basic information you need in order to set up the IDT
 507 and handle exceptions in JOS. For now, you will set up the IDT to handle
 508 interrupt vectors 0-31 (the processor exceptions). We'll handle system call
 509 interrupts later in this lab and add interrupts 32-47 (the device IRQs) in a
 510 later lab.
 511
 512 The header files inc/trap.h and kern/trap.h contain important definitions
 513 related to interrupts and exceptions that you will need to become familiar
 514 with. The file kern/trap.h contains definitions that are strictly private to
 515 the kernel, while inc/trap.h contains definitions that may also be useful to
 516 user-level programs and libraries.
 517
 518 Note: Some of the exceptions in the range 0-31 are defined by Intel to be
 519 reserved. Since they will never be generated by the processor, it doesn't
 520 really matter how you handle them. Do whatever you think is cleanest.
 521
 522 The overall flow of control that you should achieve is depicted below:
 523
 524       IDT                   trapentry.S         trap.c
 525
 526 +----------------+
 527 |   &handler1    |---------> handler1:          trap (struct Trapframe *tf)
 528 |                |             // do stuff      {
 529 |                |             call trap          // handle the exception/interrupt
 530 |                |             // ...           }
 531 +----------------+
 532 |   &handler2    |--------> handler2:
 533 |                |            // do stuff
 534 |                |            call trap
 535 |                |            // ...
 536 +----------------+
 537        .
 538        .
 539        .
 540 +----------------+
 541 |   &handlerX    |--------> handlerX:
 542 |                |             // do stuff
 543 |                |             call trap
 544 |                |             // ...
 545 +----------------+
 546
 547 Each exception or interrupt should have its own handler in trapentry.S and
 548 trap_init() should initialize the IDT with the addresses of these handlers.
 549 Each of the handlers should build a struct Trapframe (see inc/trap.h) on the
 550 stack and call trap() (in trap.c) with a pointer to the Trapframe. trap() then
 551 handles the exception/interrupt or dispatches to a specific handler function.
 552
 553 Exercise 4. Edit trapentry.S and trap.c and implement the features described
 554 above. The macros TRAPHANDLER and TRAPHANDLER_NOEC in trapentry.S should help
 555 you, as well as the T_* defines in inc/trap.h. You will need to add an entry
 556 point in trapentry.S (using those macros) for each trap defined in inc/trap.h,
 557 and you'll have to provide _alltraps which the TRAPHANDLER macros refer to. You
 558 will also need to modify trap_init() to initialize the idt to point to each of
 559 these entry points defined in trapentry.S; the SETGATE macro will be helpful
 560 here.
 561
 562 Your _alltraps should:
 563
 564  1. push values to make the stack look like a struct Trapframe
 565  2. load GD_KD into %ds and %es
 566  3. pushl %esp to pass a pointer to the Trapframe as an argument to trap()
 567  4. call trap (can trap ever return?)
 568
 569 Consider using the pushal instruction; it fits nicely with the layout of the
 570 struct Trapframe.
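
To see why pushal fits, it helps to lay out a trap frame as a C structure and
check the offsets. The structures below are modeled on the description above,
with invented Toy names; treat them as an assumption and compare against the
real definitions in inc/trap.h. The comments about which code pushes each
field reflect one plausible division of labor, not a statement about your
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Registers in the order pushal leaves them in memory, lowest address first. */
struct ToyPushRegs {
	uint32_t edi, esi, ebp, oesp, ebx, edx, ecx, eax;
};

struct ToyTrapframe {
	struct ToyPushRegs regs;   /* saved with a single pushal */
	uint16_t es, pad1;         /* segment registers, padded to 32 bits */
	uint16_t ds, pad2;
	uint32_t trapno;           /* vector number, pushed by the entry stub */
	uint32_t err;              /* error code, or a placeholder if none */
	uint32_t eip;              /* from here on: pushed by the processor */
	uint16_t cs, pad3;
	uint32_t eflags;
	uint32_t esp;              /* only pushed when crossing privilege levels */
	uint16_t ss, pad4;
};

/* Bytes the software entry path must push below the error-code slot. */
size_t toy_software_bytes(void)
{
	return offsetof(struct ToyTrapframe, err);
}
```

Because the stack grows downward, the fields at the end of the structure are
pushed first; the entry code pushes in reverse field order until ESP points
at regs.edi, at which point ESP itself is a valid pointer to the frame.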
 571
 572 Test your trap handling code using some of the test programs in the user
 573 directory that cause exceptions before making any system calls, such as user/
 574 divzero. You should be able to get make grade to succeed on the divzero,
 575 softint, and badsegment tests at this point.
 576
 577 Challenge! You probably have a lot of very similar code right now, between the
 578 lists of TRAPHANDLER in trapentry.S and their installations in trap.c. Clean
 579 this up. Change the macros in trapentry.S to automatically generate a table for
 580 trap.c to use. Note that you can switch between laying down code and data in
 581 the assembler by using the directives .text and .data.
 582
 583 Questions
 584
 585 Answer the following questions in your answers-lab3.txt:
 586
 587  1. What is the purpose of having an individual handler function for each
 588     exception/interrupt? (i.e., if all exceptions/interrupts were delivered to
 589     the same handler, what feature that exists in the current implementation
 590     could not be provided?)
 591  2. Did you have to do anything to make the user/softint program behave
 592     correctly? The grade script expects it to produce a general protection
 593     fault (trap 13), but softint's code says int $14. Why should this produce
 594     interrupt vector 13? What happens if the kernel actually allows softint's
 595     int $14 instruction to invoke the kernel's page fault handler (which is
 596     interrupt vector 14)?
 597
 598 This concludes part A of the lab. Don't forget to run make handin before the
 599 part A deadline. (If you've already completed part B by that time, you only
 600 need to submit once.)
 601
 602 Part B: Page Faults, Breakpoint Exceptions, and System Calls
 603
 604 Now that your kernel has basic exception handling capabilities, you will refine
 605 it to provide important operating system primitives that depend on exception
 606 handling.
 607
 608 Handling Page Faults
 609
 610 The page fault exception, interrupt vector 14 (T_PGFLT), is a particularly
 611 important one that we will exercise heavily throughout this lab and the next.
 612 When the processor takes a page fault, it stores the linear (i.e., virtual)
 613 address that caused the fault in a special processor control register, CR2. In
 614 trap.c we have provided the beginnings of a special function,
 615 page_fault_handler(), to handle page fault exceptions.
 616
 617 Exercise 5. Modify trap_dispatch() to dispatch page fault exceptions to
 618 page_fault_handler(). You should now be able to get make grade to succeed on
 619 the faultread, faultreadkernel, faultwrite, and faultwritekernel tests. If any
 620 of them don't work, figure out why and fix them. Remember that you can boot JOS
 621 into a particular user program using make run-x or make run-x-nox.
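
The dispatch step described above can be modeled on a host machine. This is a
minimal sketch and not the JOS code: struct ToyTrapframe, the Outcome enum, and
every toy_* function are hypothetical stand-ins, and the faulting address is
passed in as a field because a host program cannot read CR2.

```c
#include <stdint.h>

#define T_PGFLT 14   /* page fault vector, as given in the text */

/* Hypothetical, cut-down trap frame for illustration only. */
struct ToyTrapframe {
    uint32_t tf_trapno;  /* which exception occurred */
    uint32_t fault_va;   /* stands in for the value the kernel reads from CR2 */
};

enum Outcome { OUTCOME_PAGE_FAULT, OUTCOME_ENV_DESTROYED };

uint32_t toy_last_fault_va;  /* recorded by the toy page fault handler */

static enum Outcome toy_page_fault_handler(const struct ToyTrapframe *tf)
{
    toy_last_fault_va = tf->fault_va;
    return OUTCOME_PAGE_FAULT;
}

/* Route known traps to their handlers; anything unexpected is treated as
 * fatal for the environment that caused it. */
enum Outcome toy_trap_dispatch(const struct ToyTrapframe *tf)
{
    switch (tf->tf_trapno) {
    case T_PGFLT:
        return toy_page_fault_handler(tf);
    default:
        return OUTCOME_ENV_DESTROYED;
    }
}
```

New cases can be added to the switch as later exercises introduce more traps.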
 622
 623 You will further refine the kernel's page fault handling below, as you
 624 implement system calls.
 625
 626 The Breakpoint Exception
 627
 628 The breakpoint exception, interrupt vector 3 (T_BRKPT), is normally used to
 629 allow debuggers to insert breakpoints in a program's code by temporarily
 630 replacing the relevant program instruction with the special 1-byte int3
 631 software interrupt instruction. In JOS we will abuse this exception slightly by
 632 turning it into a primitive pseudo-system call that any user environment can
 633 use to invoke the JOS kernel monitor. This usage is actually somewhat
 634 appropriate if we think of the JOS kernel monitor as a primitive debugger. The
 635 user-mode implementation of panic() in lib/panic.c, for example, performs an
 636 int3 after displaying its panic message.
 637
 638 Exercise 6. Modify trap_dispatch() to make breakpoint exceptions invoke the
 639 kernel monitor. You should now be able to get make grade to succeed on the
 640 breakpoint test.
 641
 642 Challenge! Modify the JOS kernel monitor so that you can 'continue' execution
 643 from the current location (e.g., after the int3, if the kernel monitor was
 644 invoked via the breakpoint exception), and so that you can single-step one
 645 instruction at a time. You will need to understand certain bits of the EFLAGS
 646 register in order to implement single-stepping.
 647
 648 Optional: If you're feeling really adventurous, find some x86 disassembler
 649 source code - e.g., by ripping it out of QEMU, or out of GNU binutils, or just
 650 write it yourself - and extend the JOS kernel monitor to be able to disassemble
 651 and display instructions as you are stepping through them. Combined with the
 652 symbol table loading from lab 2, this is the stuff of which real kernel
 653 debuggers are made.
 654
 655 Questions
 656
 657  3. The breakpoint test case will either generate a breakpoint exception or a
 658     general protection fault depending on how you initialized the breakpoint
 659     entry in the IDT (i.e., your call to SETGATE from trap_init). Why? How do
 660     you need to set it up in order to get the breakpoint exception to work as
 661     specified above and what incorrect setup would cause it to trigger a
 662     general protection fault?
 663  4. What do you think is the point of these mechanisms, particularly in light
 664     of what the user/softint test program does?
 665
 666 System calls
 667
 668 User processes ask the kernel to do things for them by invoking system calls.
 669 When the user process invokes a system call, the processor enters kernel mode,
 670 the processor and the kernel cooperate to save the user process's state, the
 671 kernel executes appropriate code in order to carry out the system call, and
 672 then resumes the user process. The exact details of how the user process gets
 673 the kernel's attention and how it specifies which call it wants to execute vary
 674 from system to system.
 675
 676 In the JOS kernel, we will use the int instruction, which causes a processor
 677 interrupt. In particular, we will use int $0x30 as the system call interrupt.
 678 We have defined the constant T_SYSCALL to 48 (0x30) for you. You will have to
 679 set up the interrupt descriptor to allow user processes to cause that
 680 interrupt. Note that interrupt 0x30 cannot be generated by hardware, so there
 681 is no ambiguity caused by allowing user code to generate it.
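
Whether user code may raise a given vector with the int instruction is decided
by the descriptor privilege level (DPL) stored in that vector's gate. The
following host-side model is a sketch of that check only: the table, the
function names, and the choice to leave every other gate at DPL 0 are
assumptions for illustration, and the model ignores everything else the
processor does on an interrupt.

```c
#include <stdint.h>

#define T_GPFLT   13   /* general protection fault */
#define T_PGFLT   14   /* page fault */
#define T_SYSCALL 48   /* system call vector, as given in the text */

/* Fill a toy table of gate DPLs: kernel-only everywhere except the
 * system call gate, which is opened to ring 3. */
void toy_init_gate_dpl(uint8_t gate_dpl[256])
{
    for (int i = 0; i < 256; i++)
        gate_dpl[i] = 0;
    gate_dpl[T_SYSCALL] = 3;
}

/* Vector actually delivered when code running at privilege level cpl
 * executes "int $vector". A software interrupt through a gate whose DPL
 * is numerically lower than cpl raises a general protection fault. */
int toy_soft_int(int vector, int cpl, const uint8_t gate_dpl[256])
{
    if (cpl > gate_dpl[vector])
        return T_GPFLT;
    return vector;
}
```

In this model only the int instruction is subject to the check; exceptions that
the processor raises on its own are delivered regardless of the gate's DPL.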
 682
 683 The application will pass the system call number and the system call arguments
 684 in registers. This way, the kernel won't need to grub around in the user
 685 environment's stack or instruction stream. The system call number will go in
 686 %eax, and the arguments (up to five of them) will go in %edx, %ecx, %ebx, %edi,
 687 and %esi, respectively. The kernel passes the return value back in %eax. The
 688 assembly code to invoke a system call has been written for you, in syscall() in
 689 lib/syscall.c. You should read through it and make sure you understand what is
 690 going on.
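
On the kernel side, this convention means the call number and arguments can be
read straight out of the saved registers. The sketch below models that on a host
machine under stated assumptions: struct ToyRegs, TOY_SYS_MIX, and the toy_*
functions are hypothetical, and the value 3 for E_INVAL is assumed to match
inc/error.h.

```c
#include <stdint.h>

#define E_INVAL 3        /* assumption: same value as in inc/error.h */
#define TOY_SYS_MIX 0    /* hypothetical call number used only in this sketch */

/* Saved user registers that carry the call number and the five arguments. */
struct ToyRegs {
    uint32_t eax, edx, ecx, ebx, edi, esi;
};

/* Toy system call table with a single entry. The result encodes each
 * argument in its own decimal digit so the argument order is easy to check. */
static int32_t toy_syscall(uint32_t num, uint32_t a1, uint32_t a2,
                           uint32_t a3, uint32_t a4, uint32_t a5)
{
    switch (num) {
    case TOY_SYS_MIX:
        return (int32_t)(a1 + 10 * a2 + 100 * a3 + 1000 * a4 + 10000 * a5);
    default:
        return -E_INVAL;   /* unknown call number */
    }
}

/* What the dispatcher does for the system call vector: unpack the
 * registers in the documented order and store the result in eax, where
 * the user environment will find it after returning. */
void toy_handle_syscall(struct ToyRegs *r)
{
    r->eax = (uint32_t)toy_syscall(r->eax, r->edx, r->ecx,
                                   r->ebx, r->edi, r->esi);
}
```

Writing the result into the saved copy of %eax is what makes it visible to the
user environment, since the saved registers are restored on return.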
 691
 692 Exercise 7. Add a handler in the kernel for interrupt vector T_SYSCALL. You
 693 will have to edit kern/trapentry.S and kern/trap.c's trap_init(). You also need
 694 to change trap_dispatch() to handle the system call interrupt by calling
 695 syscall() (defined in kern/syscall.c) with the appropriate arguments, and then
 696 arranging for the return value to be passed back to the user process in %eax.
 697 Finally, you need to implement syscall() in kern/syscall.c. Make sure syscall()
 698 returns -E_INVAL if the system call number is invalid. You should read and
 699 understand lib/syscall.c (especially the inline assembly routine) in order to
 700 confirm your understanding of the system call interface. You may also find it
 701 helpful to read inc/syscall.h.
 702
 703 Run the user/hello program under your kernel (make run-hello). It should print
 704 "hello, world" on the console and then cause a page fault in user mode. If this
 705 does not happen, it probably means your system call handler isn't quite right.
 706 You should also now be able to get make grade to succeed on the testbss test.
 707
 708 Challenge! Implement system calls using the sysenter and sysexit instructions
 709 instead of using int $0x30 and iret.
 710
 711 The sysenter/sysexit instructions were designed by Intel to be faster than
 712 int/iret. They do this by using registers instead of the stack and by making
 713 assumptions about how the segmentation registers are used. The exact details of
 714 these instructions can be found in Volume 2B of the Intel reference manuals.
 715
 716 The easiest way to add support for these instructions in JOS is to add a
 717 sysenter_handler in kern/trapentry.S that saves enough information about the
 718 user environment to return to it, sets up the kernel environment, pushes the
 719 arguments to syscall() and calls syscall() directly. Once syscall() returns,
 720 set everything up for and execute the sysexit instruction. You will also need
 721 to add code to kern/init.c to set up the necessary model specific registers
 722 (MSRs). Section 6.1.2 in Volume 2 of the AMD Architecture Programmer's Manual
 723 and the reference on SYSENTER in Volume 2B of the Intel reference manuals give
 724 good descriptions of the relevant MSRs. You can find an implementation of wrmsr
 725 to add to inc/x86.h for writing to these MSRs here.
 726
 727 Finally, lib/syscall.c must be changed to support making a system call with
 728 sysenter. Here is a possible register layout for the sysenter instruction:
 729
 730         eax                - syscall number
 731         edx, ecx, ebx, edi - arg1, arg2, arg3, arg4
 732         esi                - return pc
 733         ebp                - return esp
 734         esp                - trashed by sysenter
 735
 736
 737 GCC's inline assembler will automatically save registers that you tell it to
 738 load values directly into. Don't forget to either save (push) and restore (pop)
 739 other registers that you clobber, or tell the inline assembler that you're
 740 clobbering them. The inline assembler doesn't support saving %ebp, so you will
 741 need to add code to save and restore it yourself. The return address can be put
 742 into %esi by using an instruction like leal after_sysenter_label, %%esi.
 743
 744 Note that this only supports 4 arguments, so you will need to leave the old
 745 method of doing system calls around to support 5 argument system calls.
 746 Furthermore, because this fast path doesn't update the current environment's
 747 trap frame, it won't be suitable for some of the system calls we add in later
 748 labs.
 749
 750 You may have to revisit your code once we enable asynchronous interrupts in the
 751 next lab. Specifically, you'll need to enable interrupts when returning to the
 752 user process, which sysexit doesn't do for you.
 753
 754 User-mode startup
 755
 756 A user program starts running at the top of lib/entry.S. After some setup, this
 757 code calls libmain(), in lib/libmain.c. You should modify libmain() to
 758 initialize the global pointer thisenv to point at this environment's struct Env
 759 in the envs[] array. (Note that lib/entry.S has already defined envs to point
 760 at the UENVS mapping you set up in Part A.) Hint: look in inc/env.h and use
 761 sys_getenvid.
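
The lookup that the hint points at is an index computation on the environment
ID. The following host-side sketch assumes that ENVX and NENV are defined as in
inc/env.h (an environment ID's low 10 bits select the slot in envs[]). The
struct ToyEnv, toy_envs, and find_thisenv names are hypothetical, and the ID is
passed in as a parameter instead of being fetched with sys_getenvid.

```c
#include <stdint.h>

/* Assumed to mirror inc/env.h: 1024 environment slots, and the low
 * LOG2NENV bits of an environment ID give its index in envs[]. */
#define LOG2NENV 10
#define NENV (1 << LOG2NENV)
#define ENVX(envid) ((envid) & (NENV - 1))

typedef int32_t envid_t;

/* Hypothetical, cut-down stand-in for struct Env. */
struct ToyEnv {
    envid_t env_id;
};

/* Stands in for the read-only envs[] array mapped at UENVS. */
struct ToyEnv toy_envs[NENV];

/* Given this environment's ID, return its slot in the array. */
const struct ToyEnv *find_thisenv(envid_t id)
{
    return &toy_envs[ENVX(id)];
}
```

For example, the ID 00001000 printed by user/hello has all of its low 10 bits
clear, so under these assumptions it selects slot 0.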
 762
 763 libmain() then calls umain, which, in the case of the hello program, is in
 764 user/hello.c. Note that after printing "hello, world", it tries to access
 765 thisenv->env_id. This is why it faulted earlier. Now that you've initialized
 766 thisenv properly, it should not fault. If it still faults, you probably
 767 haven't mapped the UENVS area user-readable (back in Part A in pmap.c; this is
 768 the first time we've actually used the UENVS area).
 769
 770 Exercise 8. Add the required code to the user library, then boot your kernel.
 771 You should see user/hello print "hello, world" and then print "i am environment
 772 00001000". user/hello then attempts to "exit" by calling sys_env_destroy() (see
 773 lib/libmain.c and lib/exit.c). Since the kernel currently only supports one
 774 user environment, it should report that it has destroyed the only environment
 775 and then drop into the kernel monitor. You should be able to get make grade to
 776 succeed on the hello test.
 777
 778 Page faults and memory protection
 779
 780 Memory protection is a crucial feature of an operating system, ensuring that
 781 bugs in one program cannot corrupt other programs or corrupt the operating
 782 system itself.
 783
 784 Operating systems usually rely on hardware support to implement memory
 785 protection. The OS keeps the hardware informed about which virtual addresses
 786 are valid and which are not. When a program tries to access an invalid address
 787 or one for which it has no permissions, the processor stops the program at the
 788 instruction causing the fault and then traps into the kernel with information
 789 about the attempted operation. If the fault is fixable, the kernel can fix it
 790 and let the program continue running. If the fault is not fixable, then the
 791 program cannot continue, since it will never get past the instruction causing
 792 the fault.
 793
 794 As an example of a fixable fault, consider an automatically extended stack. In
 795 many systems the kernel initially allocates a single stack page, and then if a
 796 program faults accessing pages further down the stack, the kernel will allocate
 797 those pages automatically and let the program continue. By doing this, the
 798 kernel only allocates as much stack memory as the program needs, but the
 799 program can work under the illusion that it has an arbitrarily large stack.
 800
 801 System calls present an interesting problem for memory protection. Most system
 802 call interfaces let user programs pass pointers to the kernel. These pointers
 803 point at user buffers to be read or written. The kernel then dereferences these
 804 pointers while carrying out the system call. There are two problems with this:
 805
 806  1. A page fault in the kernel is potentially a lot more serious than a page
 807     fault in a user program. If the kernel page-faults while manipulating its
 808     own data structures, that's a kernel bug, and the fault handler should
 809     panic the kernel (and hence the whole system). But when the kernel is
 810     dereferencing pointers given to it by the user program, it needs a way to
 811     remember that any page faults these dereferences cause are actually on
 812     behalf of the user program.
 813  2. The kernel typically has more memory permissions than the user program. The
 814     user program might pass a pointer to a system call that points to memory
 815     that the kernel can read or write but that the program cannot. The kernel
 816     must be careful not to be tricked into dereferencing such a pointer, since
 817     that might reveal private information or destroy the integrity of the
 818     kernel.
 819
 820 For both of these reasons the kernel must be extremely careful when handling
 821 pointers presented by user programs.
 822
 823 You will now solve these two problems with a single mechanism that scrutinizes
 824 all pointers passed from userspace into the kernel. When a program passes the
 825 kernel a pointer, the kernel will check that the address is in the user part of
 826 the address space, and that the page table would allow the memory operation.
 827
 828 Thus, the kernel will never suffer a page fault due to dereferencing a
 829 user-supplied pointer. If the kernel does page fault, it should panic and
 830 terminate.
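
The two checks (the address lies in the user part of the address space, and the
page table allows the access) can be modeled on a host machine. This is a
sketch, not the JOS implementation: the one-level toy_pte table, TOY_ULIM, and
the toy_* names are hypothetical, the PTE_* bit values are the standard x86
ones, and the value 6 for E_FAULT is assumed to match inc/error.h.

```c
#include <stdint.h>

#define PGSIZE 4096u
#define PTE_P  0x001u   /* present */
#define PTE_W  0x002u   /* writable */
#define PTE_U  0x004u   /* accessible from user mode */
#define E_FAULT 6       /* assumption: same value as in inc/error.h */

/* Hypothetical toy address space: 16 user pages below TOY_ULIM, with one
 * word of permission bits per page instead of a real two-level page table. */
#define TOY_ULIM   0x00010000u
#define TOY_NPAGES (TOY_ULIM / PGSIZE)

uint32_t toy_pte[TOY_NPAGES];
uint32_t toy_fault_va;   /* first offending address, set on failure */

/* Check that [va, va + len) lies below TOY_ULIM and that every page it
 * touches has at least the permissions in perm (PTE_P is always required).
 * Returns 0 on success or -E_FAULT on failure. */
int toy_user_mem_check(uint32_t va, uint32_t len, uint32_t perm)
{
    uint64_t end = (uint64_t)va + len;   /* 64-bit so va + len cannot wrap */
    uint64_t page = va - va % PGSIZE;    /* round down to a page boundary */

    perm |= PTE_P;
    for (; page < end; page += PGSIZE) {
        if (page >= TOY_ULIM || (toy_pte[page / PGSIZE] & perm) != perm) {
            /* Report va itself if the very first page is bad,
             * otherwise the start of the offending page. */
            toy_fault_va = (uint32_t)(page < va ? va : page);
            return -E_FAULT;
        }
    }
    return 0;
}
```

The loop walks whole pages because permissions are tracked per page, so a buffer
that straddles a page boundary needs both pages checked.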
 831
 832 Exercise 9. Change kern/trap.c to panic if a page fault happens in kernel mode.
 833
 834 Hint: to determine whether a fault happened in user mode or in kernel mode,
 835 check the low bits of the tf_cs.
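
The low two bits of a code segment selector hold the privilege level the
processor was running at when the trap occurred. A minimal sketch of the test,
assuming the selector values GD_KT and GD_UT from inc/memlayout.h (the function
name is hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match inc/memlayout.h. */
#define GD_KT 0x08   /* kernel text segment selector */
#define GD_UT 0x18   /* user text segment selector */

/* True if the saved %cs shows the trap came from ring 3 (user mode). */
bool trapped_from_user(uint16_t tf_cs)
{
    return (tf_cs & 3) == 3;
}
```

A user environment runs with %cs set to GD_UT | 3, while the kernel runs with
GD_KT, whose low two bits are zero.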
 836
 837 Read user_mem_assert in kern/pmap.c and implement user_mem_check in that same
 838 file.
 839
 840 Change kern/syscall.c to sanity check arguments to system calls.
 841
 842 Boot your kernel, running user/buggyhello. The environment should be destroyed,
 843 and the kernel should not panic. You should see:
 844
 845         [00001000] user_mem_check assertion failure for va 00000001
 846         [00001000] free env 00001000
 847         Destroyed the only environment - nothing more to do!
 848
 849
 850 Finally, change debuginfo_eip in kern/kdebug.c to call user_mem_check on usd,
 851 stabs, and stabstr. If you now run user/breakpoint, you should be able to run
 852 backtrace from the kernel monitor and see the backtrace traverse into
 853 lib/libmain.c before the kernel panics with a page fault. What causes this page
 854 fault? You don't need to fix it, but you should understand why it happens.
 855
 856 Note that the same mechanism you just implemented also works for malicious user
 857 applications (such as user/evilhello).
 858
 859 Exercise 10. Boot your kernel, running user/evilhello. The environment should
 860 be destroyed, and the kernel should not panic. You should see:
 861
 862         [00000000] new env 00001000
 863         [00001000] user_mem_check assertion failure for va f0100020
 864         [00001000] free env 00001000
 865
 866
 867 This completes the lab. Make sure you pass all of the make grade tests and
 868 don't forget to write up your answers to the questions and a description of
 869 your challenge exercise solution in answers-lab3.txt. Type make handin in the
 870 lab directory, and follow the directions for uploading your lab tar file to our
 871 server.
 872