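• Note that ARM instructions must be 32-bit (four bytes) aligned and Thumb instructions must be 16-bit (two bytes) aligned, so every instruction address is at least two-byte aligned.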

TL;DR:

The difference between an ARM instruction and its equivalent Thumb instruction is how they are fetched and interpreted prior to execution, not how they function.

Since the expansion from 16-bit to 32-bit instruction is accomplished via dedicated hardware within the chip, it does not slow execution.

However, the narrower 16-bit instructions take up less memory. Now let’s say that in our case the CPU is using “ARM” (and not Thumb) instructions, so we are working with instructions that are **32 bits** in size. Remember that the CPU can switch to and from Thumb mode at runtime.

  • When branching to an address held in a register, the mode is set according to the least significant bit (LSB) of that register.
  • Because of the alignment rules, the LSB of a real function address is always zero, so that bit is free to carry the mode.
    • An even address means the target code is ARM.
    • An odd address means it is Thumb.
    • The same is true when returning from a function (the return address loaded into PC).
  • One practical consequence of ARM vs. Thumb addresses is that we often see references to “symbol+1” when a Thumb function is being called.
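
As a rough illustration of the points above, here is a minimal Python sketch that splits a branch target into a mode and an execution address. The function name `decode_branch_target` and the address `0x8000` are made up for this note; the only rule being modelled is "bit 0 selects the mode, the remaining bits are the address".

```python
def decode_branch_target(value):
    """Split an interworking branch target into (mode, execution address).

    Hypothetical helper: it only models the LSB rule described in the note.
    """
    mode = "Thumb" if value & 1 else "ARM"
    address = value & ~1  # bit 0 is a mode flag, not part of the address
    return mode, address


symbol = 0x8000  # assumed address of some Thumb function
for target in (symbol + 1, symbol):
    mode, address = decode_branch_target(target)
    print(f"target={target:#x} -> mode={mode}, executes at {address:#x}")
```

This is why a disassembler may show “symbol+1” for a Thumb function: the extra 1 is the mode flag, not an offset into the function.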

https://www.mathyvanhoef.com/2013/12/reversing-and-exploiting-arm-binaries.html


A small digression on the size of the instructions.

The Arm architecture supports three instruction sets: A64, A32 and T32.

  • The A64 and A32 instruction sets have fixed instruction lengths of 32 bits.
  • The T32 instruction set was introduced as a supplementary set of 16-bit instructions that supported improved code density for user code.

Over time, T32 evolved into a mixed-length instruction set with both 16-bit and 32-bit instructions. As a result, the compiler can balance performance against code size within a single instruction set. (Source: ARM Developer)
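
To make the mixed-length idea concrete, here is a small Python sketch of how a decoder can tell the two sizes apart. It relies on the T32 encoding rule that a halfword whose top five bits are 0b11101, 0b11110 or 0b11111 is the first half of a 32-bit instruction; anything else is a complete 16-bit instruction. The function name is made up for this note.

```python
def t32_instruction_length(first_halfword):
    """Return the size in bytes (2 or 4) of the T32 instruction that
    starts with the given 16-bit halfword."""
    top5 = (first_halfword >> 11) & 0b11111
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


# 0x4770 encodes "bx lr" and 0xBF00 encodes "nop" (both 16-bit);
# 0xF000 is a possible first halfword of the 32-bit "bl" encoding.
for halfword in (0x4770, 0xBF00, 0xF000):
    print(f"{halfword:#06x}: {t32_instruction_length(halfword)} bytes")
```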


SWITCH MODES

The instructions bx, blx, and bxj can be used to switch processor mode from ARM to Thumb (or reverse).

There are two use cases: the target is a label or the target is a register [ARM].

  1. The target is a label. The instruction is of the form “blx addr”, and the mode is always switched.
  2. The target is a register, e.g. “blx lr”. The mode is set according to the least significant bit (LSB) of the register. ARM instructions are four-byte aligned and Thumb instructions are two-byte aligned, so the LSB of a real function address is always zero and can be used to carry the mode. An even value means the target is ARM code, an odd value means it is Thumb code. For example, when the register contains the value 0x8001 the processor will switch to Thumb mode and start executing at 0x8000.

The same is true when returning from a function: if the return address is even the processor will switch to ARM, if it is odd it will switch to Thumb mode. Returning can be done using “pop {pc}” or “bx lr”. Both instructions will set the processor context accordingly. Note that branch instructions without an “X” in them do not change processor mode.
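
The branch rules above can be summarised as a toy state machine. This is only a sketch for the note, not a real emulator: `branch` is a hypothetical helper, it covers just b, bl, bx and blx, and it ignores “pop {pc}” and bxj.

```python
def branch(mode, mnemonic, target, is_register):
    """Return (new_mode, execution_address) after a branch.

    Simplified model of the rules in this note (hypothetical helper).
    """
    if mnemonic in ("b", "bl"):
        # No "x" in the mnemonic: the processor mode never changes.
        return mode, target
    if mnemonic in ("bx", "blx") and is_register:
        # Register target: bit 0 selects the mode and is then cleared.
        return ("Thumb" if target & 1 else "ARM"), target & ~1
    if mnemonic == "blx":
        # Label target: the mode is always switched.
        return ("Thumb" if mode == "ARM" else "ARM"), target
    raise ValueError(f"unsupported branch form: {mnemonic}")


print(branch("ARM", "blx", 0x8001, True))     # e.g. "blx lr" with lr = 0x8001
print(branch("ARM", "blx", 0x9000, False))    # e.g. "blx addr"
print(branch("Thumb", "bx", 0x8000, True))    # e.g. "bx lr" returning to ARM code
print(branch("Thumb", "bl", 0xA000, False))   # plain "bl": mode unchanged
```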
