Both traps caused by sensitive operations in VMs and hypercalls from the host kernel enter HYP mode (EL2 in AArch64 terms) through an exception on the CPU.

In the 64-bit Arm architecture, each entry in the vector table is 32 instructions (0x80 bytes) long, and the entry that is branched to is determined by both the type of the exception and where the exception was taken from.
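The entry selection described above can be sketched as a small offset calculation. This is only an illustration: the enum and function names (`vector_source`, `vector_type`, `vector_offset`) are made up for this note and do not come from any project.

```c
#include <stdint.h>

/* Size of one vector entry: 32 instructions of 4 bytes each. */
#define VECTOR_ENTRY_SIZE 0x80u

/* Where the exception was taken from (selects a group of four entries). */
enum vector_source {
    FROM_CURRENT_EL_SP_EL0 = 0,  /* current EL, using SP_EL0 */
    FROM_CURRENT_EL_SP_ELX = 1,  /* current EL, using SP_ELx */
    FROM_LOWER_EL_AARCH64  = 2,  /* lower EL, AArch64 */
    FROM_LOWER_EL_AARCH32  = 3   /* lower EL, AArch32 */
};

/* Type of the exception (selects one entry within the group). */
enum vector_type {
    VEC_SYNC   = 0,
    VEC_IRQ    = 1,
    VEC_FIQ    = 2,
    VEC_SERROR = 3
};

/* Byte offset of the entry from the table base held in VBAR_ELx. */
static inline uint32_t vector_offset(enum vector_source src,
                                     enum vector_type type)
{
    return ((uint32_t)src * 4u + (uint32_t)type) * VECTOR_ENTRY_SIZE;
}
```

For example, `vector_offset(FROM_LOWER_EL_AARCH64, VEC_SYNC)` gives 0x400 and `vector_offset(FROM_CURRENT_EL_SP_ELX, VEC_IRQ)` gives 0x280. Sixteen entries of 0x80 bytes add up to 0x800 bytes, which is why the table is aligned to 2 KB.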

From the “raspvisor” project:

```asm
// Each vector entry is padded to 2^7 = 128 bytes (32 instructions).
// The macro has to be defined before the table that uses it.
.macro  ventry  label
  .align  7
  b \label
  .endm

.align  11                         // the table itself is 2 KB aligned
.globl vectors
vectors:
  // Taken from EL2 while using SP_EL0
  ventry  sync_invalid_el2         // Synchronous EL2
  ventry  irq_invalid_el2          // IRQ EL2
  ventry  fiq_invalid_el2          // FIQ EL2
  ventry  error_invalid_el2        // Error EL2

  // Taken from EL2 while using SP_EL2
  ventry  sync_invalid_el2         // Synchronous EL2
  ventry  el2_irq                  // IRQ EL2
  ventry  fiq_invalid_el2          // FIQ EL2
  ventry  error_invalid_el2        // Error EL2

  // Taken from a lower EL running in AArch64
  ventry  el01_sync                // Synchronous 64-bit EL0 or 1
  ventry  el01_irq                 // IRQ 64-bit EL0 or 1
  ventry  fiq_invalid_el01_64      // FIQ 64-bit EL0 or 1
  ventry  error_invalid_el01_64    // Error 64-bit EL0 or 1

  // Taken from a lower EL running in AArch32
  ventry  sync_invalid_el01_32     // Synchronous 32-bit EL0 or 1
  ventry  irq_invalid_el01_32      // IRQ 32-bit EL0 or 1
  ventry  fiq_invalid_el01_32      // FIQ 32-bit EL0 or 1
  ventry  error_invalid_el01_32    // Error 32-bit EL0 or 1

sync_invalid_el2:
  handle_invalid_entry SYNC_INVALID_EL2

irq_invalid_el2:
  handle_invalid_entry IRQ_INVALID_EL2

fiq_invalid_el2:
  handle_invalid_entry FIQ_INVALID_EL2

error_invalid_el2:
  handle_invalid_entry ERROR_INVALID_EL2

fiq_invalid_el01_64:
  handle_invalid_entry FIQ_INVALID_EL01_64

error_invalid_el01_64:
  handle_invalid_entry ERROR_INVALID_EL01_64

sync_invalid_el01_32:
  handle_invalid_entry SYNC_INVALID_EL01_32

irq_invalid_el01_32:
  handle_invalid_entry IRQ_INVALID_EL01_32

fiq_invalid_el01_32:
  handle_invalid_entry FIQ_INVALID_EL01_32

error_invalid_el01_32:
  handle_invalid_entry ERROR_INVALID_EL01_32

el2_irq:
  kernel_entry
  bl  handle_irq
  kernel_exit

el01_irq:
  kernel_entry
  bl  handle_irq
  kernel_exit

el01_sync:
  kernel_entry
  mrs x0, esr_el2    // exception syndrome
  mrs x1, elr_el2    // return address
  mrs x2, far_el2    // faulting address
  mov x3, x8 // hvc number in x8
  bl handle_sync_exception
  kernel_exit
```

The green and blue blocks cover exceptions taken from the current exception level. There are two of them because the architecture gives EL1, EL2, and EL3 the ability to switch between two stack pointers: *SP_EL0* and *SP_ELx* (for example, SP_EL2 for EL2).

Privileged software uses this to switch between thread mode (SP_EL0) and handler mode (SP_ELx). Note that when privileged software is in thread mode, that is, using SP_EL0, it is not actually using the user-space stack. The user-space stack pointer will have been saved as part of the application’s context, and the register overwritten with the privileged software’s thread-mode stack pointer.
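The stack pointer choice is visible in the low bits of a saved SPSR: for an AArch64 state, M[3:2] holds the exception level and M[0] selects SP_EL0 (the “t” modes) or SP_ELx (the “h” modes). The helpers below are a minimal sketch with made-up names, and they assume M[4] is 0 (AArch64).

```c
#include <stdbool.h>
#include <stdint.h>

/* Exception level encoded in SPSR_ELx.M[3:2] (AArch64 state only). */
static inline unsigned spsr_el(uint64_t spsr)
{
    return (unsigned)((spsr >> 2) & 0x3u);
}

/* True for the "h" modes (SP_ELx), false for the "t" modes (SP_EL0). */
static inline bool spsr_uses_sp_elx(uint64_t spsr)
{
    return (spsr & 0x1u) != 0;
}
```

With these, a mode field of 0x9 decodes as EL2 with SP_EL2 (EL2h), and 0x8 as EL2 with SP_EL0 (EL2t).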

The yellow and pink blocks cover exceptions taken from a lower exception level, and which of the two is used determines how much effort must be spent context switching out the less privileged software. Note that the yellow block is described as “one or more lower levels are 64-bit”; this is not the same as saying that the exception came from a 64-bit level.

Consider a 64-bit hypervisor at EL2 hosting a 64-bit guest kernel at EL1, which is in turn hosting a 32-bit user-space application at EL0. Let’s say we are currently running in the application code and take a hypervisor scheduler tick interrupt to EL2; we’ll need to perform a full 64-bit context switch even though we came from a 32-bit context, so in this example we’d have taken the exception to the yellow vector block. In contrast, if the guest kernel itself was 32-bit then we’d have taken the exception to the pink vector block.
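The rule in this example can be written down as a tiny model for exceptions taken to EL2. It is a sketch of the reasoning above, not architecture pseudocode; the function name and the two boolean parameters are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define LOWER_EL_AARCH64_BLOCK 0x400u  /* the "yellow" block */
#define LOWER_EL_AARCH32_BLOCK 0x600u  /* the "pink" block */

/*
 * Offset of the vector block used when EL2 takes an exception from a
 * lower level. The yellow block is used if one or more lower levels
 * are 64-bit, regardless of which level was actually interrupted.
 */
static inline uint32_t el2_lower_block(bool el1_is_64bit, bool el0_is_64bit)
{
    return (el1_is_64bit || el0_is_64bit) ? LOWER_EL_AARCH64_BLOCK
                                          : LOWER_EL_AARCH32_BLOCK;
}
```

The 64-bit guest kernel with a 32-bit application corresponds to `el2_lower_block(true, false)`, which selects the yellow block at 0x400. A fully 32-bit guest, `el2_lower_block(false, false)`, selects the pink block at 0x600.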


Debugging Tips

This code shows how to configure EL3 before passing control to the hypervisor in EL2:

```asm
globalfunc entry3
        // Install dummy vector table; each entry branches-to-self
        ADRP    x0, dummy_vectors
        MSR     VBAR_EL3, x0

        //
        // Configure SCR_EL3
        //
        //   10:10 RW       x1      make EL2 be 64-bit
        //   08:08 HCE      x1      enable HVC instructions
        //   05:04 RES1     x3      reserved
        //   00:00 NS       x1      switch to Normal world
        //
        MOV     w0, #0x531
        MSR     SCR_EL3, x0

        //
        // Configure SCTLR_EL2
        //
        //   29:28 RES1     x3      reserved
        //   23:22 RES1     x3      reserved
        //   18:18 RES1     x1      reserved
        //   16:16 RES1     x1      reserved
        //   12:12 I        x0      disable allocation of instrs into unified $s
        //   11:11 RES1     x1      reserved
        //   05:04 RES1     x3      reserved
        //   02:02 C        x0      disable allocation of data into data/unified $s
        //   00:00 M        x0      disable MMU
        //
        LDR     w0, =0x30C50830
        MSR     SCTLR_EL2, x0

        //
        // Prepare to drop to EL2h with all asynchronous exceptions masked
        //
        //   09:09 D        x1      Mask debug exceptions
        //   08:08 A        x1      Mask SErrors
        //   07:07 I        x1      Mask IRQs
        //   06:06 F        x1      Mask FIQs
        //   04:04 M[4]     x0      Bits 03:00 define an AArch64 state
        //   03:00 M[3:0]   x9      EL2h
        //
        MOV     w0, #0x3C9
        MSR     SPSR_EL3, x0

        // Drop to hypervisor code
        ADR     x0, entry2
        MSR     ELR_EL3, x0
        ERET
endfunc entry3
```
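As a cross-check, the three constants written to SCR_EL3, SCTLR_EL2, and SPSR_EL3 can be rebuilt from the bit positions listed in the comments. The sketch below does this in C; the `BIT` and `FIELD` macros and the function names are local helpers made up for this note.

```c
#include <stdint.h>

#define BIT(n)        (UINT32_C(1) << (n))
#define FIELD(v, lsb) ((uint32_t)(v) << (lsb))

/* RW (10), HCE (8), RES1 (5:4), NS (0) */
static inline uint32_t scr_el3_value(void)
{
    return BIT(10) | BIT(8) | FIELD(0x3, 4) | BIT(0);
}

/* Only the RES1 bits are set; I (12), C (2) and M (0) stay clear. */
static inline uint32_t sctlr_el2_value(void)
{
    return FIELD(0x3, 28) | FIELD(0x3, 22) | BIT(18) | BIT(16) |
           BIT(11) | FIELD(0x3, 4);
}

/* D, A, I, F masked (9:6), M[4] = 0 (AArch64), M[3:0] = 0x9 (EL2h) */
static inline uint32_t spsr_el3_value(void)
{
    return BIT(9) | BIT(8) | BIT(7) | BIT(6) | FIELD(0x9, 0);
}
```

These evaluate to 0x531, 0x30C50830, and 0x3C9 respectively, matching the immediates in the assembly.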

We refer to the vector table installed here as a “dummy” because each entry is just a branch-to-self instruction.

Even though we’re not going to do anything in EL3, it’s good practice to install a dummy vector table so that any exception results in safely spinning, and we can attach a debugger and read out the syndrome registers to figure out what went wrong. Without it, we would be thrown into a recursive exception and the syndrome register would be trampled.
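Once a syndrome value has been read out with the debugger, it helps to split it into its fields. In ESR_ELx the exception class (EC) is in bits 31:26, the instruction length bit (IL) is bit 25, and the instruction-specific syndrome (ISS) is in bits 24:0. The helpers below are a minimal sketch; their names, and the `ESR_EC_HVC64` constant name, are made up for this note.

```c
#include <stdint.h>

/* EC value reported for an HVC instruction executed in AArch64 state. */
#define ESR_EC_HVC64 0x16u

/* Exception class, ESR_ELx[31:26]. */
static inline unsigned esr_ec(uint64_t esr)
{
    return (unsigned)((esr >> 26) & 0x3Fu);
}

/* Instruction length bit, ESR_ELx[25] (1 = 32-bit instruction). */
static inline unsigned esr_il(uint64_t esr)
{
    return (unsigned)((esr >> 25) & 0x1u);
}

/* Instruction-specific syndrome, ESR_ELx[24:0]. */
static inline uint32_t esr_iss(uint64_t esr)
{
    return (uint32_t)(esr & 0x1FFFFFFu);
}
```

For instance, a raw value of 0x5A000001 decodes to EC 0x16 (an HVC from AArch64), IL 1, and ISS 1. The same split applies to the `esr_el2` value that `el01_sync` passes to `handle_sync_exception`.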

