Homework 2: Configure Interrupts and Exceptions

This assignment will teach you how to extend your minimal kernel with support for handling interrupts and exceptions. Technically, you can do this assignment on Linux, Mac or WSL (Windows Subsystem for Linux). Submit your code through Gradescope (see instructions at the bottom of this page).

NOTE: YOU CANNOT PUBLICLY RELEASE SOLUTIONS TO THIS HOMEWORK.

It’s ok to show your work to your future employer as a private Git repo, however any public release is prohibited.

Overview

This homework asks you to extend your HW1 submission with support for handling interrupts and exceptions.

You already have a reasonably good handle on how interrupts and exceptions work from the lectures, but lets try to review these concepts one more time. Just to have a plan the main things you need to do to enable exception handling are:

Configure global descriptor table (GDT)
Configure interrupt descriptor table (IDT)
Configure Task State Segment (TSS)
Allocate interrupt stacks and add it to TSS
Implement entry and exit wrappers for the interrupt handlers
Enable hardware interrupts by configuring I/O APIC and LAPIC

Suggested approach

While you can implement this assignment any way you like, the easiest way is likely to merge the HW2 branch of hello-os on your HW1 submissubmission. It will not build, but it’s not too hard to make it work and fill in the gaps needed for this assignment to work.

Interrupts (overview)

At a high level, when the interrupt or exception is raised the hardware saves a 5 CPU registers on the interrupt stack and starts excution of the interrupt handler:

SS (stack segment)
RSP (stack pointer)
RFLAGS (flags register)
CS (code segment)
RIP (instruction pointer)

This new interrupt stack is used when the interrupt forces a privilege level switch. The reason for this is that the kernel cannot trust the value of the RSP that can be changed by the user to point to an unmapped page. Hence it cannot ask hardware to save anything on that user stack.

There is a couple of interesting questions during the interrupt transition:

How to locate this interrupt stack?
How to locate the interrupt handler (another name is interrupt service routine or ISR)?
Do we need to swtich privilege the level?
Is interrupt or exception even allowed?

To get a better understanding on how everything is connected together, the figure below shows dependencies between datastructures involved in handling an interrupt.

                IDT                        GDTR -------+                                                              
IDTR ---->+-----------+                                |                                                              
          | entry 0   |  +----------------------+      v      GDT                                                         
          |           |/ | handler   CS   flags |     +------------------+                                             
          | entry 1   |\ +---+--------+---------+     |                  |                                             
          |           |     |         +-------------->| kernel code(PL0) |                                             
          |           |     |                         |                  |        TSS                                  
          |           |     |               TS ------>| TSS              +------>+-------------+                       
          |           |     |                         |                  |       |             |                       
          |           |     |                         +------------------+       |             |       Interrupt Stack    
          | entry 255 |     |                                                    | int stack   +----->+---------------+
          +-----------+     |                                                    |             |      |    SS         |
                            +--------> handler()                                 |             |      +---------------+
                                                                                 +-------------+      |    RSP        |
                                                                                                      +---------------+
                                                                                                      |   RFLAGS      |
                                                                                                      +---------------+
                                                                                                      |    CS         |
                                                                                                      +---------------+
                                                                                                      |    RIP        |
                                                                                                      |               |
                                                                                                      |    ...        |
                                                                                                      |               |
                                                                                                      |               |
                                                                                                      |               |
                                                                                                      |               |
                                                                                                      |               |
                                                                                                      +---------------+

Three hardware registers: IDTR, GDTR, and TS are important as they allow the hardware to locate the locations of the IDT, GDT, and TSS in memory. Here, when I say “hardware” I mean actual silicon and microcode logic of the CPU.

The hardware starts by looking up the interrupt descriptor table register (IDTR) that points to the interrupt descriptor table (IDT) in memory (this is a virtual address). Each interrupt has a number (0-255) so the hardware uses this number as an index in the interrupt table.

The first 32 interrupts are reserved by Intel and have special meaning, e.g., page fault, division by zero, etc. I think it makes sense for you to review what these exceptions are by looking at the Table 7-1 in the Intel Software Developer Manual, Volume 3, Chapter 7.

Note, I agree that this is super odd that Intel starts description of interrupt handling with how it works in the 32bit mode and then generalizes it for 64bit machines (instead of starting from the 64bit mode implementation directly) … Well, what can we do. Maybe there is some logic to it like backward compatibility with previous versions of the manual, or maybe the entire CPU is designed with this compatibility in mind and hence the 64bit mode is just an extension of 32bit mode, I don’t know.

Each entry in the interrupt descriptor table has a couple of important fields (Figure 7-8. 64-Bit IDT Gate Descriptors provides specific layout of the Interrupt Gate Descriptor, i.e., entry in the table on 64bit machines).

Pointer to the handler itself (Intel calls it “offset”)
Code segment selector to pick the new entry in the GDT potentially switching the privilege level (remember privilege levels are stored in the GDT). In this homework we will configure kernel code entry with PL0 (ring 0) and switch to it on interrupt.
And a couple of important flags: whether subsequent interrupts are disabled and whether the interrupt is allowed (this is a protection mechanism so the user cannot invoke int 14 and trigger a page fault).

To locate the interrupt stack the hardware uses the TSS entry in the GDT, i.e., one of the entries in the GDT instead of pointing to a code or a data segment, points to the location of theTSS table in memory. The CPU uses the TS selector (similar to CS and SS) to figure out which entry in GDT is TSS entry (you have to initialize TS correctly).

TSS itself will have a pointer to the stack to switch to at ring 0. Note, there are some other funky stacks there – interrupt stacks accessible via the interrupt stack table (IST) which work more like emergency backup stacks when the normal ring 0 stack is not available (we gonna talk about them later and probably use one of the IST stacks for handling inter-processor interrupts and double faults).

GDT

Lets start by configuring GDT. We really need just 6 entries (the first one is always 0, reserved by Intel), the order of others doesn’t matter, but we can stick to the order below.

+---------------------------------+
|   Reserved (NULL)               |
+---------------------------------+
|   Kernel data (writable, PL0)   | 
+---------------------------------+
|   Kernel code (executable, PL0) |
+------------------------------U--+
|   User data (writable, PL3)     |
+---------------------------------+
|   User code (executable, PL3)   |
+---------------------------------+
|   TSS (it's a double sized      |
|    entry, so it can actually    |
|                                 |
+---------------------------------+

Ok, this looks easy. We just need to declare this data structure in Rust and use all the high-level language goodies to make everything neat and type safe. We provide example definitions GDT definition

To load GDT into the hardware register (GDTR register) we need to define another data structure which contains the pointer to the GDT and its size (kind of like a smart pointer but defined by the hardware interface), it’s called DescriptorTablePointer

Bitfield macro

As an example, of how CPU-defined data structures that use a ton of bitfields can be conveniently used in Rust, our code uses the bitfield! macro to provide convenient interface for accessing GDT types see example here.

Every segment descriptor in the GDT/LDT has a so-called Access Byte (or “Type”

“Flags”) that defines how the segment can be used. In long mode, base/limit aren’t enforced anymore, but the access byte fields still matter for privilege checking, present bits, etc.

Rust bitfield! macro allows us to describe the Access Byte of the x86 segmentation descriptor (specifically Code/Data descriptors, since TSS descriptors will have a different format.

The bitfield! macro generates a wrapper type (AccessByte) around a primitive integer (u8) and then automatically generates getters and setters for specific bit ranges.

So for each line like:

present, set_present: 7;

it generates: a getter method fn present(&self) -> bool, a setter method fn set_present(&mut self, val: bool), where bit 7 of the byte is read/written.

If a range is given, like:

pub privilege, set_privilege: 6, 5;

then it generates getters/setters for 2 bits (bits 6 and 5 together, i.e. values 0-3).

TSS

In the context of interrupt processing the main role of TSS is to store locations of the interrupt, ring 0, stack as well as IST stacks. Well, we talked a lot about stack switching in class, but let’s revisit it again. TSS can hold pointers to 8 different stacks (well, really more, it can have stacks for ring1, ring2, etc., but we not going to use them):

One regular ring0 stack (pointed by the rsp[0] field of the TSS)
Seven IST stacks that can be thought of as special “emergency” stacks that can be used for processing interrupts in cases when the normal ring0 stack is not available (e.g., a double fault can execute on a dedicated IST stack since it’s possible that the second fault is due to the fact that ring0 stack is out of space).

To set the normal rsp[0] stack we use the set_rsp method from the x86 crate by Gerd Zellweger. We call set_rsp() when we initialize TSS in init_cpu().

To set the other stacks we use a similar function set_ist(). We initialize all 7 stacks although likely we will just use one or two.

IDT

Similar to GDT we define a collection of Rust data structures that describe it, IDT definition

Here we use two different interrupt handler types: one that pushes a dummy error code 0 on the stack and one that assumes that the error code is actually pushed by the hardware already (we discuss this below).

There is a 4-bit field (ist) in the IDT descriptor (an entry in the IDT table) which can pick one of the 7 IST stacks, or when it’s 0 picking the default ring 0 stack. For not we initialize all exception handlers to run on the default kernel stack.

Note, when we start running threads we will use a per-thread stack as an interrupt stack (i.e., on a context switch we’ll update the rsp[0] field of the TSS). But since we don’t have any threads running yet, we’ll just set rsp[0] to IST[0].

We load the address of the IDT into the hardware IDTR register with the load method which creates a special DescriptorTablePointer similar to how we did for GDT.

Interrupt handlers

On every interrupt the CPU pushes 5 or in some cases 6 values (when the error code is pushed) on the stack in the following format (stack grows down):

+----------------------------+
|   SS                       |
+----------------------------+
|   RSP                      | 
+----------------------------+
|   RFLAGS                   |
+----------------------------+
|   CS                       |
+----------------------------+
|   RIP                      |
+----------------------------+
|   ERROR CODE (optional)    | < -- RSP when interrup handler is reached
+----------------------------+

Bonus question: do you remember why does the hardware save these 5 values on interrupt transition?

Don’t look …. think more… don’t look … looking again? … right… those registers get changed immediately during the interrup transition and if we want to know the original values someone has to save them. E.g., RFLAGS might get changed if the interrupt disables further interrupts, SS:RSP are changed since we switch to the new stack. CS:RIP are changed since we start executing of the interrupt handler, potentially changing the privilege level)

Remember we talked a lot about stack normalization, which is needed in 32bit mode since hardware does not push SS:RSP if the privilege level is not changed, but thankfully in 64 bit mode everything is pushed unconditionally whether you changed the privilege level or not. And this is super convenient.

One detail here is that error code is pushed only by some exceptions (see the Intel exception table in Chapter 7) and never by interrupts.

Ok, now we have to figure out how to build an interrupt handler in Rust. A couple of things are important:

We want to have access to the 6 values saved on the stack so we can look up error code, user RIP, etc.
We need to normalize stack between exceptions that do and don’t push the error code (i.e., we can just push 0 for the ones that don’t push the error code).
In most cases we also want access to user registers (note, hardware does not save anything beside these 5/6 values above, so if we want to save user registers we need to do it by hand when we start executing the interrupt handler).

We discussed the idea of the x86_interrupt calling convention in class, but as real kernel hackers we are not going to use it. We want to have access to all user registers inside the handler, but also we want to be able to control how everything gets saved on each interrupt entry (we will talk about swapgs and other corner cases of entering into the kernel safely later on in the class).

Instead we are going to use naked functions and naked assembly in Rust.

Specifically, our plan is to create a Rust data structure InterruptStackFrame that matches the following stack layout and pass it to the interrupt handler.

+----------------------------+
|   SS                       |
+----------------------------+
|   RSP                      | 
+----------------------------+
|   RFLAGS                   |
+----------------------------+
|   CS                       |
+----------------------------+
|   RIP                      |
+----------------------------+
|   ERROR CODE (optional)    | < -- RSP when interrup handler is reached
+----------------------------+ 
|   RAX                      |
+----------------------------+
|   RDI                      | 
+----------------------------+
|   ... all other regs       |
+----------------------------+
|   ... no need to save RSP  |
+----------------------------+
|   R14                      |
+----------------------------+
|   R15                      | <-- RSP when we pass control to Rust
+----------------------------+

We then use a combination of naked functions and naked assembly. Naked functions allow us to have complete control of how the code is generated, i.e., know that compiler will not use any of the registers or alter the stack in any way (see how we use naked funcitons here).

Naked functions that are restricted to only naked assembly in the body quarantee that no prologue or epilogue will be generated by the compiler. This solves the problem of implementing handler entry trampolines in assembly – hacked functions provide the same level of control.

Here we use Rust macros to generate a trampoline (naked function) that saves all general registers on the stack and then call the handler itself. We use “extern C” to make sure the handler uses the C ABI calling convention.

Enabling hardware interrupts (configuring I/O APIC and LAPIC)

Like we discussed in class we plan to rely on modern interrupt controllers, i.e. a couple of

I/O APIC (one per system, implemented on the motherboard, part of the Platform Controller Hub (PCH) on Intel, or I/O hub or southbridge on AMD)
LAPIC (one per-CPU core)

Remember that the way they work is that I/O APIC receives inputs from physical devices and then routes them to different CPU cores. You can configure the routing scheme, i.e., route a specific interrupt to a specific CPU, or choose a round-robin scheme across a subset of CPUs.

We follow a typical route of initializing these two interrupt controllers.

We start by disabling the legacy PIC init()

We then figure out the address of the I/O APIC configuration area (it doesn’t work for us, so we hardcoded the address for now – can leverage your help debugging this!)

Then we initialize the I/O APIC by using the x86 crate, specifically the init() function . This should be done only once on the boot CPU. However, we need to enable the timer IRQ 0 on each CPU with the init_cpu() function Internally, we utilize the [enable()]](https://docs.rs/x86/latest/src/x86/apic/ioapic.rs.html#72) method of the x86 crate to configure IRQ 0.

Then on each CPU we initialize LAPIC (the local APIC) by using the x86 crate again (Gerd rocks!), see the lapic::init() function. We probe LAPIC to figure out its configuration memory area and then use the attach() function from the x86 crate to enable LAPIC. Here, we use an older XAPIC interface as it’s a bit simpler (maybe we can actually switch to X2APIC, a newer interface, as it looks like Gerd got it working in x86 crate). Note, that while cal attach() we enable LAPIC and set the spurious interrupt vector to be IRQ15 (ideally we should implement this handler).

We then enable the timer by calling the tsc_enable() to configure the timer in “periodic” mode. It’s a bit tricky to configure the frequency. We set the XAPIC_TIMER_INIT_COUNT (initial count value) with the tsc_set_oneshot() function (probably a bad name, we shoould change it). This initial value along with a couple of other registers define the timer interrupt frequency (divisor in the XAPIC_TIMER_DIV_CONF register and the base APIC timer clock frequency).

Finally, tsc_enable() takes the vector argument, the interrupt number that we put in the LVT table to route the timer interrupt to (IRQ 0) to vector 32.

Note, the timer supports two other modes: one-shot and TSC deadline, this is configured with the bits of the XAPIC_LVT_TIMER register:

Bits 18:17 select the timer mode (see Section 11.5.4)
(00b) one-shot mode using a count-down value
(01b) periodic mode reloading a count-down value
(10b) TSC-Deadline mode using absolute target value in IA32_TSC_DEADLINE MSR

In the TSC deadline mode instead of counting down a value like in “one-shot” or “periodic” modes, the timer simply fires when the TSC reaches a specific value.

In the “one-shot” mode the Local APIC timer is programmed to fire a single interrupt after a specified time delay, then stop. It does not automatically reload or repeat. I.e., you will have to ask it to fire another timer interrupt from within the interrup handler.

Acknowledging interrupts

Note, that every time you process an interrupt, to receive the next one you have to acknowledge the current one.

Inside every interrupt handler (not a CPU exception handler but a timer interrupt handler that receives an interrupt form external hardware via APIC) you need to add the end_of_interrupt() function from the lapic.rs.

Making everything work together

At this point we’re ready to stich everything together. Your goal for this homework is to configure all parts necessary for interrupt processing and get the timer interrupt working. Specifically, to test this homework we ask you to print a . on serial line for each timer interrupt.

We left some comments in the source code hinting what needs to be done, but if you carefully read this homework and the source code it should be clear what to do (don’t hesitate to ask questions).

Submitting your work

Autograder will build and poll on your qemu output for 20 - 30 secs. Make sure set your timer frequency low. Also make sure you are printing . with no new line on each timer interrupt. You may use submission.sh to zip up your files for gradescope submission. Submit your work on Gradescope similar to HW1.

Homework 2: Configure Interrupts and Exceptions#

Overview#

Suggested approach#

Interrupts (overview)#

GDT#

Bitfield macro#

TSS#

IDT#

Interrupt handlers#

Enabling hardware interrupts (configuring I/O APIC and LAPIC)#

Acknowledging interrupts#

Making everything work together#

Submitting your work#