Both traps from sensitive operations in VMs and hypercalls from the host kernel enter HYP mode (EL2 in the 64-bit architecture) through an exception on the CPU.
In the 64-bit Arm architecture, each entry in the vector table is 32 instructions (128 bytes) long, and the entry that is branched to is determined by both the type of the exception and where the exception was taken from.
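
As a rough illustration of that lookup, the following C sketch computes the byte offset of an entry from the vector base address. The enum and function names are hypothetical (they come from neither raspvisor nor the Arm documentation); the layout assumed is four groups of four entries, each entry being 32 instructions of 4 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum vec_source {
    SRC_CURRENT_SP_EL0 = 0,  /* taken from the current EL, using SP_EL0 */
    SRC_CURRENT_SP_ELX = 1,  /* taken from the current EL, using SP_ELx */
    SRC_LOWER_AARCH64  = 2,  /* taken from a lower EL, 64-bit */
    SRC_LOWER_AARCH32  = 3   /* taken from a lower EL, 32-bit */
};

enum vec_type {
    VEC_SYNC   = 0,
    VEC_IRQ    = 1,
    VEC_FIQ    = 2,
    VEC_SERROR = 3
};

/* 32 instructions x 4 bytes per instruction = 0x80 bytes per entry. */
#define VEC_ENTRY_BYTES (32u * 4u)

/* Byte offset of the selected entry from the vector base address. */
static inline uint32_t vector_offset(enum vec_source src, enum vec_type type)
{
    assert((unsigned)src <= SRC_LOWER_AARCH32);
    assert((unsigned)type <= VEC_SERROR);
    return ((uint32_t)src * 4u + (uint32_t)type) * VEC_ENTRY_BYTES;
}
```

Under these assumptions, vector_offset(SRC_LOWER_AARCH64, VEC_IRQ) evaluates to 0x480, and the whole table spans 16 x 0x80 = 0x800 bytes, which is consistent with the 2 KB alignment requested by .align 11 in the listing that follows.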


From the “raspvisor” project:
.align 11
.globl vectors
vectors:
ventry sync_invalid_el2 // Synchronous EL2t (current EL, using SP_EL0)
ventry irq_invalid_el2 // IRQ EL2t
ventry fiq_invalid_el2 // FIQ EL2t
ventry error_invalid_el2 // Error EL2t
ventry sync_invalid_el2 // Synchronous EL2h (current EL, using SP_EL2)
ventry el2_irq // IRQ EL2h
ventry fiq_invalid_el2 // FIQ EL2h
ventry error_invalid_el2 // Error EL2h
ventry el01_sync // Synchronous 64-bit EL0 or 1
ventry el01_irq // IRQ 64-bit EL0 or 1
ventry fiq_invalid_el01_64 // FIQ 64-bit EL0 or 1
ventry error_invalid_el01_64 // Error 64-bit EL0 or 1
ventry sync_invalid_el01_32 // Synchronous 32-bit EL0 or 1
ventry irq_invalid_el01_32 // IRQ 32-bit EL0 or 1
ventry fiq_invalid_el01_32 // FIQ 32-bit EL0 or 1
ventry error_invalid_el01_32 // Error 32-bit EL0 or 1
sync_invalid_el2:
handle_invalid_entry SYNC_INVALID_EL2
irq_invalid_el2:
handle_invalid_entry IRQ_INVALID_EL2
fiq_invalid_el2:
handle_invalid_entry FIQ_INVALID_EL2
error_invalid_el2:
handle_invalid_entry ERROR_INVALID_EL2
fiq_invalid_el01_64:
handle_invalid_entry FIQ_INVALID_EL01_64
error_invalid_el01_64:
handle_invalid_entry ERROR_INVALID_EL01_64
sync_invalid_el01_32:
handle_invalid_entry SYNC_INVALID_EL01_32
irq_invalid_el01_32:
handle_invalid_entry IRQ_INVALID_EL01_32
fiq_invalid_el01_32:
handle_invalid_entry FIQ_INVALID_EL01_32
error_invalid_el01_32:
handle_invalid_entry ERROR_INVALID_EL01_32
el2_irq:
kernel_entry
bl handle_irq
kernel_exit
el01_irq:
kernel_entry
bl handle_irq
kernel_exit
el01_sync:
kernel_entry
mrs x0, esr_el2
mrs x1, elr_el2
mrs x2, far_el2
mov x3, x8 // hvc number in x8
bl handle_sync_exception
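kernel_exit

// Note: the assembler needs this macro defined before the vectors table that uses it
.macro ventry label
.align 7 // each entry is 2^7 = 128 bytes
b \label
.endm

Regarding the green and blue blocks (the two groups of entries for exceptions taken from the current exception level), the architecture gives EL1, EL2, and EL3 the ability to switch between two stack pointers: *SP_EL0* and *SP_ELx* (for example, SP_EL2 for EL2).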
Privileged software uses this to switch between thread mode (SP_EL0) and handler mode (SP_ELx). Note that when privileged software is in thread mode, that is, using SP_EL0, it is not actually using the user-space stack. Rather, the user-space stack pointer will have been saved as part of the application’s context, and the register will have been overwritten with the privileged software’s thread mode stack pointer.
The yellow and pink blocks (the two groups for exceptions taken from a lower exception level) determine how much work is needed to context switch out the less privileged software. Note that in the yellow block we say “one or more lower levels are 64-bit”; this is not the same as saying that we came from a 64-bit level.
Consider a 64-bit hypervisor at EL2 hosting a 64-bit guest kernel at EL1, which is in turn hosting a 32-bit user-space application at EL0. Let’s say we are currently running in the application code and take a hypervisor scheduler tick interrupt to EL2; we’ll need to perform a full 64-bit context switch even though we came from a 32-bit context, so in this example we’d have taken the exception to the yellow vector block. In contrast, if the guest kernel itself was 32-bit then we’d have taken the exception to the pink vector block.
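
The selection rule described above can be sketched in C as follows. This is a simplified model with hypothetical names, and it assumes that the lower-level choice depends only on whether the level immediately below the target is 64-bit. Note that the register width of the level we actually came from is deliberately not an input.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum vec_block {
    BLOCK_CURRENT_SP_EL0,  /* from the current EL while using SP_EL0 */
    BLOCK_CURRENT_SP_ELX,  /* from the current EL while using SP_ELx */
    BLOCK_LOWER_AARCH64,   /* "yellow": a 64-bit level sits below the target */
    BLOCK_LOWER_AARCH32    /* "pink": only 32-bit levels sit below the target */
};

/*
 * from_el / target_el: exception levels 0..3, with from_el <= target_el.
 * using_sp_elx: stack pointer selection at the time of the exception
 *               (only relevant when from_el == target_el).
 * level_below_target_is_64bit: e.g. for target EL2, whether EL1 is 64-bit.
 */
static inline enum vec_block select_block(unsigned from_el, unsigned target_el,
                                          bool using_sp_elx,
                                          bool level_below_target_is_64bit)
{
    assert(from_el <= target_el && target_el <= 3);

    if (from_el == target_el)
        return using_sp_elx ? BLOCK_CURRENT_SP_ELX : BLOCK_CURRENT_SP_EL0;

    return level_below_target_is_64bit ? BLOCK_LOWER_AARCH64
                                       : BLOCK_LOWER_AARCH32;
}
```

For the example above, select_block(0, 2, false, true) models the 32-bit application under a 64-bit guest kernel and yields BLOCK_LOWER_AARCH64, while passing false for the last argument (a 32-bit guest kernel) yields BLOCK_LOWER_AARCH32.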
Debugging Tips
This code shows how to perform the configuration of EL3 before passing control to the hypervisor in EL2:
globalfunc entry3
// Install dummy vector table; each entry branches-to-self
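ADRP x0, dummy_vectors
ADD x0, x0, :lo12:dummy_vectors // ADRP yields only the 4 KB page base; add the low 12 bits in case the table is not 4 KB aligned
MSR VBAR_EL3, x0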
//
// Configure SCR_EL3
//
// 10:10 RW x1 make EL2 be 64-bit
// 08:08 HCE x1 enable HVC instructions
// 05:04 RES1 x3 reserved
// 00:00 NS x1 switch to Normal world
//
MOV w0, #0x531
MSR SCR_EL3, x0
//
// Configure SCTLR_EL2
//
// 29:28 RES1 x3 reserved
// 23:22 RES1 x3 reserved
// 18:18 RES1 x1 reserved
// 16:16 RES1 x1 reserved
// 12:12 I x0 disable allocation of instrs into unified $s
// 11:11 RES1 x1 reserved
// 05:04 RES1 x3 reserved
// 02:02 C x0 disable allocation of data into data/unified $s
// 00:00 M x0 disable MMU
//
LDR w0, =0x30C50830
MSR SCTLR_EL2, x0
//
// Prepare to drop to EL2h with all asynchronous exceptions masked
//
// 09:09 D x1 Mask debug exceptions
// 08:08 A x1 Mask SErrors
// 07:07 I x1 Mask IRQs
// 06:06 F x1 Mask FIQs
// 04:04 M[4] x0 Bits 03:00 define an AArch64 state
// 03:00 M[3:0] x9 EL2h
//
MOV w0, #0x3C9
MSR SPSR_EL3, x0
// Drop to hypervisor code
ADR x0, entry2
MSR ELR_EL3, x0
ERET
endfunc entry3

We refer to the vector table installed here as a “dummy” because each entry is just a branch-to-self instruction.
Even though we’re not going to be doing anything in EL3, it’s good practice to install a dummy vector table so that any exception results in the core safely spinning, and we can use a debugger to read out the syndrome registers to figure out what went wrong. Otherwise we would be thrown into a recursive exception and the syndrome register would be overwritten.
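
Once the core is spinning in such a vector, the value read from the syndrome register still has to be interpreted. The C sketch below shows one way to split a raw ESR_ELx value into its exception class (EC, bits 31:26) and instruction-specific syndrome (ISS, bits 24:0). The function names are hypothetical, and only a handful of common EC encodings are listed; anything else is simply reported as not decoded.

```c
#include <stdint.h>

/* Exception class: ESR_ELx bits [31:26]. */
static inline uint32_t esr_ec(uint64_t esr)
{
    return (uint32_t)((esr >> 26) & 0x3fu);
}

/* Instruction-specific syndrome: ESR_ELx bits [24:0]. */
static inline uint32_t esr_iss(uint64_t esr)
{
    return (uint32_t)(esr & 0x1ffffffu);
}

/* Human-readable name for a few common exception classes (not exhaustive). */
static inline const char *esr_ec_name(uint32_t ec)
{
    switch (ec) {
    case 0x00: return "Unknown reason";
    case 0x01: return "Trapped WFI/WFE";
    case 0x16: return "HVC from AArch64";
    case 0x17: return "SMC from AArch64";
    case 0x18: return "Trapped MSR/MRS/system instruction";
    case 0x20: return "Instruction abort from a lower EL";
    case 0x21: return "Instruction abort from the current EL";
    case 0x24: return "Data abort from a lower EL";
    case 0x25: return "Data abort from the current EL";
    default:   return "Not decoded in this sketch";
    }
}
```

For example, a raw value of 0x5A000000 decodes to EC 0x16 with an ISS of 0, that is, an HVC #0 executed in AArch64 state. This is one of the cases the el01_sync handler in the earlier listing forwards to handle_sync_exception through esr_el2.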