Chapter 54: Virtualization Framework¶
Android Virtualization Framework (AVF) brings hardware-backed virtual machines to Android devices, enabling confidential computing workloads that are isolated even from the host operating system. Built on pKVM (protected KVM), crosvm, and Microdroid, AVF creates a complete ecosystem for running trusted code within protected virtual machines (pVMs). This chapter examines every layer of the stack -- from the EL2 hypervisor through the VM firmware, the Rust-based virtual machine monitor, the lightweight guest OS, and the userspace service architecture that ties it all together.
54.1 Android Virtualization Framework (AVF)¶
54.1.1 Overview and Motivation¶
The Android Virtualization Framework provides secure and private execution environments that go beyond the traditional Android app sandbox. While the app sandbox provides process-level isolation enforced by the Linux kernel, AVF provides hardware-enforced isolation through CPU virtualization extensions. A protected VM's memory is inaccessible even to a compromised Android host kernel.
The framework's README at packages/modules/Virtualization/README.md states the core
value proposition:
Android Virtualization Framework (AVF) provides secure and private execution environments for executing code. AVF is ideal for security-oriented use cases that require stronger isolation assurances over those offered by Android's app sandbox.
AVF targets several critical use cases:
- Confidential computation -- Running machine learning models or sensitive algorithms where the code and data must not be observable by the host.
- Trusted compilation -- The composd service uses AVF to compile ART artifacts inside a VM, ensuring the compiler itself has not been tampered with.
- Remote Key Provisioning -- The RKP VM handles cryptographic key operations in an isolated environment attested by a remote server.
- Isolated services -- Third-party workloads that require strong guarantees about their execution environment.
54.1.2 High-Level Architecture¶
AVF is structured as a layered system with clear boundaries between components:
graph TB
subgraph "Host Android"
APP["Android App"]
VS["VirtualizationService"]
VM_CLI["vm CLI Tool"]
COMPOSD["composd"]
VIRTMGR["virtmgr"]
end
subgraph "Virtual Machine Monitor"
CROSVM["crosvm (Rust VMM)"]
end
subgraph "Hypervisor (EL2)"
PKVM["pKVM Hypervisor"]
end
subgraph "Protected VM"
PVMFW["pVM Firmware (pvmfw)"]
MICRODROID["Microdroid Guest OS"]
PAYLOAD["VM Payload"]
end
APP -->|"Java/AIDL API"| VS
VM_CLI -->|"Binder"| VS
COMPOSD -->|"Binder"| VS
VS --> VIRTMGR
VIRTMGR --> CROSVM
CROSVM -->|"KVM ioctls"| PKVM
PKVM -->|"loads"| PVMFW
PVMFW -->|"verifies & boots"| MICRODROID
MICRODROID -->|"runs"| PAYLOAD
54.1.3 The com.android.virt APEX¶
AVF is delivered as the com.android.virt APEX module, making it updatable
independently of the main Android platform. The APEX contains:
- The vm command-line tool
- The VirtualizationService and virtmgr daemons
- The Microdroid kernel and system images
- The pvmfw.bin firmware binary
- The crosvm binary
- Java and native client libraries
- The composd compilation orchestration daemon
To install the APEX from source:
banchan com.android.virt aosp_arm64
UNBUNDLED_BUILD_SDKS_FROM_SOURCE=true m apps_only dist
adb install out/dist/com.android.virt.apex
adb reboot
54.1.4 Protected vs Non-Protected VMs¶
AVF supports two VM modes:
| Property | Non-Protected VM | Protected VM (pVM) |
|---|---|---|
| Memory isolation | Standard KVM isolation | pKVM-enforced: host cannot access guest memory |
| Firmware | No pvmfw | pvmfw validates guest before boot |
| DICE chain | Not available | Full DICE chain from ROM to payload |
| Remote attestation | Not supported | Supported via RKP VM |
| Cuttlefish support | Yes | No (requires hardware pKVM) |
| Debug support | Full | Limited (controlled by debug policy) |
The vm info command reports which modes a device supports. Its implementation in
packages/modules/Virtualization/android/vm/src/main.rs queries the device capabilities:
fn command_info(service: &dyn IVirtualizationService) -> Result<(), Error> {
    let non_protected_vm_supported = hypervisor_props::is_vm_supported()?;
    let protected_vm_supported = hypervisor_props::is_protected_vm_supported()?;
    match (non_protected_vm_supported, protected_vm_supported) {
        (false, false) => println!("VMs are not supported."),
        (false, true) => println!("Only protected VMs are supported."),
        (true, false) => println!("Only non-protected VMs are supported."),
        (true, true) => println!("Both protected and non-protected VMs are supported."),
    }
    // ...
}
54.1.5 Supported Devices¶
As documented in packages/modules/Virtualization/docs/getting_started.md, AVF
supports:
- Pixel 7 / 7 Pro (aosp_panther, aosp_cheetah) -- pKVM enabled by default
- Pixel 6 / 6 Pro (aosp_oriole, aosp_raven) -- pKVM requires explicit enable
- Pixel Fold (aosp_felix)
- Pixel Tablet (aosp_tangorpro)
- Cuttlefish (aosp_cf_x86_64_phone) -- Non-protected VMs only
For Pixel 6 devices, pKVM must be explicitly enabled before protected VMs can be used; getting_started.md describes the procedure.
54.1.6 DICE Attestation Chain¶
The Device Identifier Composition Engine (DICE) provides a cryptographic chain of trust from device ROM through each boot stage to the running VM payload. Each stage measures the next, creating a certificate chain that can prove the VM's identity.
graph LR
ROM["ROM (UDS)"] --> ABL["Android Bootloader"]
ABL --> PVMFW["pvmfw"]
PVMFW --> KERNEL["Microdroid Kernel"]
KERNEL --> OS["Microdroid OS"]
OS --> PAYLOAD["VM Payload"]
style ROM fill:#f96,stroke:#333
style ABL fill:#fc6,stroke:#333
style PVMFW fill:#ff6,stroke:#333
style KERNEL fill:#6f6,stroke:#333
style OS fill:#6cf,stroke:#333
style PAYLOAD fill:#96f,stroke:#333
As described in packages/modules/Virtualization/docs/pvm_dice_chain.md:
A VM DICE chain is a cryptographically linked certificates chain that captures measurements of the VM's entire execution environment.
This chain should be rooted in the device's ROM and encompass all components involved in the VM's loading and boot process.
Vendors construct the chain from ROM to ABL, then hand it off to pvmfw. The handover format is CBOR-encoded:
PvmfwDiceHandover = {
  1 : bstr .size 32,     ; CDI_Attest
  2 : bstr .size 32,     ; CDI_Seal
  3 : DiceCertChain,     ; Android DICE chain
}
The CDI (Compound Device Identifier) values serve two purposes:
- CDI_Attest -- Used to derive the attestation key pair for identity proofs
- CDI_Seal -- Used to derive sealing keys for encrypting persistent data
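The layering described above can be sketched in a few lines. This is a deliberately simplified model, not the DICE algorithm: it assumes a toy, non-cryptographic mixing function (FNV-1a style) in place of the key derivation function a real implementation uses, shrinks the 32-byte CDI to a `u64`, and the names `extend_cdi` and `derive_chain` are hypothetical. Only the shape of the computation is taken from the text: each stage's secret depends on the previous stage's secret and on a measurement of the stage being loaded.

```rust
/// Toy stand-in for a KDF: folds `measurement` into `cdi` byte by byte
/// (FNV-1a style). NOT cryptographically secure; for illustration only.
fn extend_cdi(cdi: u64, measurement: &[u8]) -> u64 {
    const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut state = cdi;
    for &byte in measurement {
        state ^= u64::from(byte);
        state = state.wrapping_mul(FNV_PRIME);
    }
    state
}

/// Walks the boot stages in order, deriving each stage's CDI from the
/// previous CDI and the measurement of the stage being loaded.
/// `uds` plays the role of the ROM's unique device secret.
fn derive_chain(uds: u64, measurements: &[&[u8]]) -> Vec<u64> {
    let mut cdis = Vec::with_capacity(measurements.len());
    let mut current = uds;
    for measurement in measurements {
        current = extend_cdi(current, measurement);
        cdis.push(current);
    }
    cdis
}
```

In this model, changing the measurement of one stage changes the CDI of that stage and of every stage after it, while earlier stages are unaffected. That is the property that lets a verifier (or a sealing key) detect a modified component anywhere in the chain.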
54.1.7 Remote Attestation¶
VM remote attestation allows a pVM to prove its trustworthiness to a third party. The
mechanism involves two stages as described in
packages/modules/Virtualization/docs/vm_remote_attestation.md:
- RKP VM attestation -- The lightweight RKP VM is attested against the remote RKP server, which validates the DICE chain is rooted in a genuine device.
- pVM attestation -- The now-trusted RKP VM validates the DICE chain of client pVMs, confirming they are running expected code in a genuine VM environment.
sequenceDiagram
participant pVM as Protected VM
participant RKP_VM as RKP VM
participant RKP_Server as RKP Server
Note over RKP_VM,RKP_Server: Phase 1: RKP VM Attestation
RKP_VM->>RKP_Server: Submit DICE chain
RKP_Server->>RKP_Server: Verify root public key in RKP DB
RKP_Server->>RKP_Server: Verify RKP VM markers in chain
RKP_Server-->>RKP_VM: Attestation certificate
Note over pVM,RKP_VM: Phase 2: pVM Attestation
pVM->>RKP_VM: Submit pVM DICE chain + challenge
RKP_VM->>RKP_VM: Validate pVM chain against own chain
RKP_VM-->>pVM: Signed attestation certificate + private key
The output of successful attestation includes a leaf certificate with a custom OID
extension (1.3.6.1.4.1.11129.2.1.29.1) that describes the VM payload:
AttestationExtension ::= SEQUENCE {
    attestationChallenge       OCTET_STRING,
    isVmSecure                 BOOLEAN,
    vmComponents               SEQUENCE OF VmComponent,
}
54.1.8 Source Repository Structure¶
The AVF repository at packages/modules/Virtualization/ is organized as:
packages/modules/Virtualization/
    android/
        composd/                    # Compilation orchestration service
        virtualizationservice/      # Core VirtualizationService daemon
        virtmgr/                    # VM manager (per-VM process)
        vm/                         # vm CLI tool
        MicrodroidDemoApp/          # Demo application
        VmAttestationDemoApp/       # Attestation demo
        fd_server/                  # File descriptor server
    build/
        microdroid/                 # Microdroid OS build files
    guest/
        pvmfw/                      # pVM Firmware
        service_vm/                 # Service VM (RKP)
        kernel/                     # Microdroid kernel config
        encryptedstore/             # Encrypted storage support
    libs/
        framework-virtualization/   # Java API
        libvm_payload/              # VM Payload native API
        libvmbase/                  # Common VM base library
        libvmclient/                # VM client library
        libhypervisor_backends/     # Hypervisor abstraction
    docs/                           # Documentation
    tests/                          # Test suites
54.2 pKVM Hypervisor¶
54.2.1 Architecture Overview¶
pKVM (protected KVM) is a lightweight hypervisor that runs at ARM Exception Level 2 (EL2). It extends the standard Linux KVM to provide memory isolation guarantees that hold even if the host kernel is compromised. Unlike traditional hypervisors, pKVM is designed to have a minimal trusted computing base (TCB) -- it does not manage devices or schedule VMs; instead, it focuses exclusively on memory access control.
graph TB
subgraph "EL3 (Secure Monitor)"
TF_A["ARM Trusted Firmware"]
end
subgraph "EL2 (Hypervisor)"
PKVM_CORE["pKVM Core"]
S2PT["Stage-2 Page Tables"]
end
subgraph "EL1 (Host Kernel)"
HOST_KVM["KVM Host Driver"]
HOST_KERNEL["Linux Kernel"]
end
subgraph "EL1 (Guest)"
GUEST_OS["Guest Kernel"]
end
subgraph "EL0 (Host User)"
CROSVM_PROC["crosvm Process"]
end
subgraph "EL0 (Guest User)"
PAYLOAD_PROC["Payload Process"]
end
TF_A --> PKVM_CORE
PKVM_CORE --> S2PT
HOST_KVM -->|"HVC calls"| PKVM_CORE
S2PT -->|"controls"| HOST_KERNEL
S2PT -->|"controls"| GUEST_OS
HOST_KERNEL --> CROSVM_PROC
GUEST_OS --> PAYLOAD_PROC
54.2.2 Memory Isolation Model¶
The fundamental security property of pKVM is that a protected VM's memory is inaccessible to the host. This is enforced through ARM Stage-2 page tables controlled exclusively by the EL2 hypervisor:
- Host memory -- Mapped in the host's Stage-2 tables, unmapped from all guest Stage-2 tables.
- Guest memory -- Mapped in the guest's Stage-2 tables, unmapped from the host's Stage-2 tables. The host cannot read, write, or execute guest memory.
- Shared memory -- Explicitly shared regions mapped in both host and guest Stage-2 tables. Used for virtio communication.
This design means that even a kernel-level exploit on the host cannot read a pVM's private memory. The hypervisor intercepts and validates all memory mapping operations.
54.2.3 pKVM Hypervisor Interface¶
The pvmfw documentation at packages/modules/Virtualization/guest/pvmfw/README.md
specifies the hypervisor calls available to guests:
Memory management:
- MEMINFO (function ID 0xc6000002) -- Query memory granule information
- MEM_SHARE (function ID 0xc6000003) -- Share a memory region with the host
- MEM_UNSHARE (function ID 0xc6000004) -- Revoke host access to a shared region
MMIO guard:
- MMIO_GUARD_INFO (function ID 0xc6000005) -- Query MMIO guard information
- MMIO_GUARD_ENROLL (function ID 0xc6000006) -- Enable MMIO guarding
- MMIO_GUARD_MAP (function ID 0xc6000007) -- Map an MMIO region
- MMIO_GUARD_UNMAP (function ID 0xc6000008) -- Unmap an MMIO region
Standard ARM interfaces:
- ARM SMCCC v1.1 -- Calling convention
- PSCI v1.0 -- Power state coordination (reset, shutdown)
- TRNG v1.0 -- True random number generation
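The function IDs listed above are not arbitrary numbers: they follow the bit layout of the Arm SMC Calling Convention. The sketch below decodes that layout, assuming the standard SMCCC fields (bit 31 for a fast call, bit 30 for the 64-bit convention, bits 29:24 for the owning entity, bits 15:0 for the function number). The `FunctionId` struct and `decode_function_id` helper are hypothetical names introduced for illustration and do not come from the pvmfw sources.

```rust
/// Decoded fields of a 32-bit SMCCC function identifier.
#[derive(Debug, PartialEq, Eq)]
struct FunctionId {
    fast_call: bool, // bit 31
    is_64bit: bool,  // bit 30
    owner: u8,       // bits 29:24, the owning entity number
    function: u16,   // bits 15:0
}

/// Splits a raw function ID into its SMCCC fields.
fn decode_function_id(id: u32) -> FunctionId {
    FunctionId {
        fast_call: (id & (1 << 31)) != 0,
        is_64bit: (id & (1 << 30)) != 0,
        owner: ((id >> 24) & 0x3f) as u8,
        function: (id & 0xffff) as u16,
    }
}

// Function IDs as listed in the text.
const MEM_SHARE: u32 = 0xc600_0003;
const MMIO_GUARD_UNMAP: u32 = 0xc600_0008;
```

Decoding `MEM_SHARE` yields a fast, 64-bit call with owner 6 and function number 3. In SMCCC, owner 6 is the range reserved for vendor-specific hypervisor services, which is why all seven calls share the `0xc6` prefix and differ only in the low bits.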
54.2.4 Stage-2 Page Table Management¶
When pKVM starts a protected VM, it creates a dedicated set of Stage-2 page tables. The key operations are:
sequenceDiagram
participant Host as Host Kernel
participant pKVM as pKVM (EL2)
participant S2 as Stage-2 Tables
Host->>pKVM: Create VM (KVM_CREATE_VM)
pKVM->>S2: Allocate guest Stage-2 tables
pKVM->>S2: Remove guest pages from host Stage-2
Note over pKVM,S2: Guest memory now invisible to host
Host->>pKVM: Map shared memory region
pKVM->>S2: Map region in both host and guest Stage-2
Note over pKVM,S2: Shared region for virtio transport
54.2.5 pvmfw Loading by pKVM¶
When the VMM requests a protected VM, pKVM loads pvmfw from a protected memory region into the guest's address space. This region was prepared by the Android Bootloader (ABL) and is described via a device tree reserved memory node:
reserved-memory {
    pkvm_guest_firmware {
        compatible = "linux,pkvm-guest-firmware-memory";
        reg = <0x0 0x80000000 0x40000>;
        no-map;
    };
};
Key points about pvmfw loading:
- The hypervisor does not interpret pvmfw -- it only protects and loads the pre-prepared binary.
- The pvmfw binary must be 4KiB-aligned in guest address space.
- Configuration data is appended to pvmfw and included in the same protected region.
- Once loaded, pvmfw becomes the entry point of the VM, executing before any guest code.
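As a small worked example of the numbers in the reserved-memory node, the sketch below combines the three `reg` cells into a base address and a size and checks the 4 KiB alignment mentioned above. It assumes two address cells and one size cell, which the example value implies but the text does not state, and the helper names `parse_reg` and `is_page_aligned` are hypothetical.

```rust
const PAGE_SIZE: u64 = 4096;

/// Combines `reg` cells into (base, size), assuming #address-cells = 2
/// and #size-cells = 1 (an assumption; the parent node is not shown).
fn parse_reg(cells: [u32; 3]) -> (u64, u64) {
    let base = (u64::from(cells[0]) << 32) | u64::from(cells[1]);
    let size = u64::from(cells[2]);
    (base, size)
}

/// Returns true if `addr` is a multiple of the 4 KiB page size.
fn is_page_aligned(addr: u64) -> bool {
    addr % PAGE_SIZE == 0
}
```

For the example node, `parse_reg([0x0, 0x80000000, 0x40000])` gives a base of 0x80000000 and a size of 0x40000 bytes (256 KiB), both of which are page-aligned.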
54.2.6 Memory Sharing Protocol¶
For virtio communication, guest memory must be explicitly shared with the host. The sharing protocol uses hypercalls:
sequenceDiagram
participant Guest as Guest (pvmfw/kernel)
participant pKVM as pKVM Hypervisor
participant Host as Host (crosvm)
Guest->>pKVM: MEM_SHARE(page_addr)
pKVM->>pKVM: Map page in host Stage-2
pKVM-->>Guest: Success
Note over Guest,Host: Host can now access the shared page
Guest->>pKVM: MEM_UNSHARE(page_addr)
pKVM->>pKVM: Unmap page from host Stage-2
pKVM-->>Guest: Success
Note over Guest,Host: Host can no longer access the page
The guest is responsible for ensuring that sensitive data is never placed in shared memory regions. The pvmfw firmware handles initial memory sharing for the virtio transport before handing off to the guest kernel.
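The share/unshare protocol can be modeled as a small state machine over the guest's pages. The following is a toy model of the hypervisor's bookkeeping, not pKVM code: the types `PageState`, `ShareError`, and `GuestPages` are hypothetical, and the choice to reject a second share of the same page is an assumption of the sketch.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PageState {
    GuestPrivate,   // mapped only in the guest's Stage-2 tables
    SharedWithHost, // mapped in both guest and host Stage-2 tables
}

#[derive(Debug, PartialEq, Eq)]
enum ShareError {
    NotGuestPage,
    AlreadyShared,
    NotShared,
}

/// Toy model of the hypervisor's view of one guest's pages.
struct GuestPages {
    pages: BTreeMap<u64, PageState>,
}

impl GuestPages {
    /// All pages start out private to the guest.
    fn new(page_addrs: &[u64]) -> Self {
        let pages: BTreeMap<u64, PageState> = page_addrs
            .iter()
            .map(|&addr| (addr, PageState::GuestPrivate))
            .collect();
        GuestPages { pages }
    }

    /// Models MEM_SHARE: the guest grants the host access to one page.
    fn mem_share(&mut self, addr: u64) -> Result<(), ShareError> {
        match self.pages.get_mut(&addr) {
            None => Err(ShareError::NotGuestPage),
            Some(PageState::SharedWithHost) => Err(ShareError::AlreadyShared),
            Some(state) => {
                *state = PageState::SharedWithHost;
                Ok(())
            }
        }
    }

    /// Models MEM_UNSHARE: the guest revokes the host's access.
    fn mem_unshare(&mut self, addr: u64) -> Result<(), ShareError> {
        match self.pages.get_mut(&addr) {
            None => Err(ShareError::NotGuestPage),
            Some(PageState::GuestPrivate) => Err(ShareError::NotShared),
            Some(state) => {
                *state = PageState::GuestPrivate;
                Ok(())
            }
        }
    }

    /// The host can touch a page only while it is shared.
    fn host_can_access(&self, addr: u64) -> bool {
        self.pages.get(&addr) == Some(&PageState::SharedWithHost)
    }
}
```

The point the model makes is that the transition is always initiated by the guest: the host has no operation that moves a page from private to shared.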
54.2.7 MMIO Guard¶
The MMIO Guard mechanism prevents the guest from accessing arbitrary MMIO regions. This is important because in a virtual machine, MMIO access is typically trapped by the hypervisor and forwarded to the VMM. A malicious VMM could present fake device responses. With MMIO Guard:
- The guest must explicitly enroll in MMIO guarding (MMIO_GUARD_ENROLL).
- Only mapped MMIO regions (MMIO_GUARD_MAP) generate traps to the VMM.
- Access to unmapped MMIO regions triggers an abort rather than a trap.
This limits the attack surface from a potentially compromised VMM.
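The three rules above amount to a simple decision procedure, sketched here as a toy model. The `MmioGuard` type and its methods are hypothetical, addresses are treated as page-granular keys, and the behavior before enrollment (every access traps to the VMM) is an assumption based on the description of unguarded MMIO in the text.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq, Eq)]
enum MmioOutcome {
    TrapToVmm, // access is forwarded to the VMM for emulation
    Abort,     // access faults inside the guest instead
}

/// Toy model of per-guest MMIO guard state.
struct MmioGuard {
    enrolled: bool,
    mapped: BTreeSet<u64>,
}

impl MmioGuard {
    fn new() -> Self {
        MmioGuard { enrolled: false, mapped: BTreeSet::new() }
    }

    /// Models MMIO_GUARD_ENROLL.
    fn enroll(&mut self) {
        self.enrolled = true;
    }

    /// Models MMIO_GUARD_MAP for one page.
    fn map(&mut self, page: u64) {
        self.mapped.insert(page);
    }

    /// Models MMIO_GUARD_UNMAP for one page.
    fn unmap(&mut self, page: u64) {
        self.mapped.remove(&page);
    }

    /// Decides what happens when the guest touches an MMIO page.
    fn access(&self, page: u64) -> MmioOutcome {
        if !self.enrolled || self.mapped.contains(&page) {
            MmioOutcome::TrapToVmm
        } else {
            MmioOutcome::Abort
        }
    }
}
```

Once enrolled, a stray access to an address the guest never mapped produces an abort, so the VMM never sees the access or the data it carried.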
54.3 crosvm: The Virtual Machine Monitor¶
54.3.1 Overview¶
crosvm is a Rust-based Virtual Machine Monitor (VMM) that originated in ChromiumOS and was adopted by Android for AVF. It manages the lifecycle of virtual machines, providing virtual hardware devices and acting as the interface between the host kernel and the guest.
The external/crosvm/ARCHITECTURE.md document describes the core design principles:
The principle characteristics of crosvm are:
- A process per virtual device, made using fork on Linux
- Each process is sandboxed using minijail
- Support for several CPU architectures, operating systems, and hypervisors
- Written in Rust for security and safety
54.3.2 Startup Sequence¶
A crosvm VM session follows a well-defined startup sequence, as documented in
external/crosvm/ARCHITECTURE.md:
graph TB
A["main.rs: Parse CLI args into Config"] --> B["run_config: Setup VM"]
B --> C["Load Linux kernel (ELF/bzImage)"]
C --> D["Create control sockets"]
D --> E["Arch::build_vm\n(aarch64/x86_64/riscv64)"]
E --> F["create_devices\n(PCI + virtio devices)"]
F --> G["Arch::assign_pci_addresses"]
G --> H["Arch::generate_pci_root\n(jail devices with minijail)"]
H --> I["RunnableLinuxVm\n(VCPUs + control loop)"]
I --> J["Run until shutdown"]
From external/crosvm/src/main.rs, the top-level run_vm function:
fn run_vm(cmd: RunCommand, log_config: LogConfig) -> Result<CommandStatus> {
    let cfg = match TryInto::<Config>::try_into(cmd) {
        Ok(cfg) => cfg,
        Err(e) => {
            eprintln!("{}", e);
            return Err(anyhow!("{}", e));
        }
    };
    // ...
    let exit_state = crate::sys::run_config(cfg)?;
    Ok(CommandStatus::from(exit_state))
}
54.3.3 Exit States¶
crosvm defines specific exit codes that distinguish between different VM termination
conditions, as defined in external/crosvm/src/main.rs:
#[repr(i32)]
enum CommandStatus {
    /// Exit with success. Also used to mean VM stopped successfully.
    SuccessOrVmStop = 0,
    /// VM requested reset.
    VmReset = 32,
    /// VM crashed.
    VmCrash = 33,
    /// VM exit due to kernel panic in guest.
    GuestPanic = 34,
    /// Invalid argument was given to crosvm.
    InvalidArgs = 35,
    /// VM exit due to vcpu stall detection.
    WatchdogReset = 36,
}
These exit codes allow virtmgr to determine why a VM terminated and report the
appropriate death reason to the VM owner.
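A consumer of these exit codes might classify them as in the sketch below. The numeric values come from the enum above; the `ExitKind` type and `classify_exit` function are hypothetical and are not virtmgr's actual death-reason mapping, which is not shown in this section.

```rust
/// Hypothetical classification of a crosvm exit status; the names are
/// illustrative and do not mirror virtmgr's real types.
#[derive(Debug, PartialEq, Eq)]
enum ExitKind {
    Stopped,
    Reset,
    Crash,
    GuestPanic,
    InvalidArgs,
    WatchdogReset,
    Unknown(i32),
}

/// Maps a raw process exit code to the condition it signals.
fn classify_exit(code: i32) -> ExitKind {
    match code {
        0 => ExitKind::Stopped,
        32 => ExitKind::Reset,
        33 => ExitKind::Crash,
        34 => ExitKind::GuestPanic,
        35 => ExitKind::InvalidArgs,
        36 => ExitKind::WatchdogReset,
        other => ExitKind::Unknown(other),
    }
}
```

Keeping an explicit `Unknown` case matters in practice: a process killed by a signal or failing before VM setup exits with a status outside this table, and the caller should not mistake it for a clean stop.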
54.3.4 Architecture Support¶
crosvm supports three CPU architectures, each with dedicated modules:
| Architecture | Source Directory | Key Components |
|---|---|---|
| AArch64 | external/crosvm/aarch64/src/ | FDT generation, GIC setup, PSCI |
| x86_64 | external/crosvm/x86_64/src/ | ACPI tables, CPUID, GDT, boot params |
| RISC-V 64 | external/crosvm/riscv64/src/ | FDT generation, SBI interface |
Each architecture implements the Arch trait with these key methods:
- build_vm() -- Create architecture-specific VM configuration
- assign_pci_addresses() -- Assign PCI bus addresses
- generate_pci_root() -- Build the PCI device tree
The x86_64 module includes the following files; most of them (ACPI, CPUID, GDT, bzImage loading, boot parameters) have no counterpart on ARM:
external/crosvm/x86_64/src/
acpi.rs # ACPI table generation
bootparam.rs # Linux boot parameter structure
bzimage.rs # bzImage kernel loading
cpuid.rs # CPUID emulation
fdt.rs # Flattened Device Tree
gdb.rs # GDB stub for debugging
gdt.rs # Global Descriptor Table
interrupts.rs # Interrupt handling
mpspec.rs # Multiprocessor specification
54.3.5 Process-Per-Device Sandboxing¶
The most distinctive architectural feature of crosvm is its process-per-device model. Each virtual device runs in a separate forked process, sandboxed using minijail:
graph TB
subgraph "crosvm main process"
MAIN["Main Control Loop"]
VCPU1["VCPU 0 Thread"]
VCPU2["VCPU 1 Thread"]
end
subgraph "Device Processes (forked + sandboxed)"
BLK["Block Device\n(minijail)"]
NET["Net Device\n(minijail)"]
RNG["RNG Device\n(minijail)"]
CONSOLE["Console Device\n(minijail)"]
VSOCK["Vsock Device\n(minijail)"]
end
MAIN -->|"ProxyDevice"| BLK
MAIN -->|"ProxyDevice"| NET
MAIN -->|"ProxyDevice"| RNG
MAIN -->|"ProxyDevice"| CONSOLE
MAIN -->|"ProxyDevice"| VSOCK
VCPU1 -->|"Bus lookup"| MAIN
VCPU2 -->|"Bus lookup"| MAIN
As described in the architecture documentation:
During the device creation routine, each device will be created and then wrapped in a ProxyDevice which will internally fork (but not exec) and minijail the device, while dropping it for the main process. The only interaction that the device is capable of having with the main process is via the proxied trait methods of BusDevice, shared memory mappings such as the guest memory, and file descriptors that were specifically allowed by that device's security policy.
54.3.6 Minijail Sandboxing¶
Each device process is sandboxed using minijail with Linux namespaces and seccomp filters. Seccomp policies are architecture-specific:
external/crosvm/jail/seccomp/
aarch64/ # ARM64 seccomp policies
arm/ # ARM32 seccomp policies
x86_64/ # x86_64 seccomp policies
riscv64/ # RISC-V seccomp policies
Each device has its own seccomp policy file that whitelists only the syscalls it
needs. The policy files include a common base (common_device.policy) and add
device-specific syscalls.
The sandboxing provides defense in depth: even if a malicious guest compromises a virtual device process, the attacker is confined to a minimal syscall set within an isolated namespace.
54.3.7 Hypervisor Abstraction Layer¶
crosvm supports multiple hypervisor backends through an abstraction layer:
external/crosvm/hypervisor/src/
lib.rs # Trait definitions
kvm/ # Linux KVM backend
geniezone/ # MediaTek GenieZone
gunyah/ # Qualcomm Gunyah
halla/ # (development backend)
haxm/ # Intel HAXM (for Windows)
whpx/ # Windows Hypervisor Platform
On Android, the primary backend is KVM (including pKVM for protected VMs). The
hypervisor module in external/crosvm/hypervisor/src/ provides:
hypervisor/src/
aarch64.rs # ARM64-specific hypervisor traits
x86_64.rs # x86_64-specific hypervisor traits
riscv64.rs # RISC-V specific hypervisor traits
caps.rs # Capability detection
54.3.8 Device Model¶
The crosvm device model is built on a hierarchy of traits:
classDiagram
class BusDevice {
<<trait>>
+read(offset, data)
+write(offset, data)
}
class PciDevice {
<<trait>>
+config_space_read()
+config_space_write()
+preferred_address()
}
class VirtioDevice {
<<trait>>
+device_type()
+queue_max_sizes()
+features()
+activate(memory, interrupt, queues)
}
class VirtioPciDevice {
-virtio_device: VirtioDevice
}
class ProxyDevice {
-child_pid: pid_t
}
BusDevice <|-- PciDevice : "blanket impl"
PciDevice <|.. VirtioPciDevice
VirtioDevice <|.. VirtioPciDevice : "wraps"
BusDevice <|.. ProxyDevice : "proxies via fork"
As the ARCHITECTURE.md explains:
The root of the crosvm device model is the Bus structure and its friend the BusDevice trait. The Bus structure is a virtual computer bus used to emulate the memory-mapped I/O bus and also I/O ports for x86 VMs.
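The idea of a bus that routes accesses to devices by address range can be sketched as follows. This is a simplified stand-in, not crosvm's implementation: the `BusDevice` trait here has a reduced, hypothetical signature, and `Bus`, `Scratch`, and their methods are illustrative names only.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for crosvm's `BusDevice`; the real trait differs.
trait BusDevice {
    fn read(&mut self, offset: u64, data: &mut [u8]);
    fn write(&mut self, offset: u64, data: &[u8]);
}

/// Address-range router: maps a base address to (length, device).
struct Bus {
    devices: BTreeMap<u64, (u64, Box<dyn BusDevice>)>,
}

impl Bus {
    fn new() -> Self {
        Bus { devices: BTreeMap::new() }
    }

    /// Registers a device; returns false for empty or overlapping ranges.
    fn insert(&mut self, base: u64, len: u64, dev: Box<dyn BusDevice>) -> bool {
        let end = match base.checked_add(len) {
            Some(end) if len > 0 => end,
            _ => return false,
        };
        let overlaps = self
            .devices
            .iter()
            .any(|(b, entry)| base < *b + entry.0 && *b < end);
        if overlaps {
            return false;
        }
        self.devices.insert(base, (len, dev));
        true
    }

    /// Finds the device covering `addr` and the offset within its range.
    fn lookup(&mut self, addr: u64) -> Option<(u64, &mut Box<dyn BusDevice>)> {
        let (base, entry) = self.devices.range_mut(..=addr).next_back()?;
        let offset = addr - *base;
        if offset < entry.0 {
            Some((offset, &mut entry.1))
        } else {
            None
        }
    }

    /// Dispatches a read; returns false if no device claims the address.
    fn read(&mut self, addr: u64, data: &mut [u8]) -> bool {
        match self.lookup(addr) {
            Some((offset, dev)) => {
                dev.read(offset, data);
                true
            }
            None => false,
        }
    }

    /// Dispatches a write; returns false if no device claims the address.
    fn write(&mut self, addr: u64, data: &[u8]) -> bool {
        match self.lookup(addr) {
            Some((offset, dev)) => {
                dev.write(offset, data);
                true
            }
            None => false,
        }
    }
}

/// Example device: four bytes of scratch registers.
struct Scratch {
    bytes: [u8; 4],
}

impl BusDevice for Scratch {
    fn read(&mut self, offset: u64, data: &mut [u8]) {
        for (i, b) in data.iter_mut().enumerate() {
            *b = self.bytes.get(offset as usize + i).copied().unwrap_or(0);
        }
    }

    fn write(&mut self, offset: u64, data: &[u8]) {
        for (i, b) in data.iter().enumerate() {
            if let Some(slot) = self.bytes.get_mut(offset as usize + i) {
                *slot = *b;
            }
        }
    }
}
```

Devices see only offsets relative to their own base, so the same device type can be placed anywhere in the address space. In crosvm the object behind the bus entry may be a proxy that forwards the call to a sandboxed child process rather than the device itself.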
The virtio device implementations include:
| Device | Source File | Purpose |
|---|---|---|
| Block | devices/src/virtio/block/ | Disk I/O |
| Net | devices/src/virtio/net.rs | Network I/O |
| Console | devices/src/virtio/console/ | Serial console |
| RNG | devices/src/virtio/rng.rs | Random number generation |
| Vsock | devices/src/virtio/vsock/ | Host-guest socket communication |
| Balloon | devices/src/virtio/balloon.rs | Memory ballooning |
| SCSI | devices/src/virtio/scsi/ | SCSI device emulation |
| Sound | devices/src/virtio/snd/ | Audio device |
| GPU | devices/src/virtio/gpu/ | Graphics rendering |
| IOMMU | devices/src/virtio/iommu.rs | I/O memory management |
| Pmem | devices/src/virtio/pmem.rs | Persistent memory |
| Filesystem | devices/src/virtio/fs/ | Shared filesystem (virtio-fs) |
| TPM | devices/src/virtio/tpm.rs | Trusted Platform Module |
54.3.9 GuestMemory Architecture¶
Guest memory management is a critical subsystem. The ARCHITECTURE.md describes five related types:
- GuestMemory -- Reference to all guest memory. Can be cloned, but the underlying memory is always the same. Implemented using MemoryMapping and SharedMemory. For non-protected VMs, it is mapped into host address space but is non-contiguous.
- SharedMemory -- Wraps a memfd. Can be mapped using MemoryMapping. Cannot be cloned.
- VolatileMemory -- Trait for generic access to non-contiguous memory. GuestMemory implements this trait.
- VolatileSlice -- Analogous to a Rust slice but with asynchronously changing data. Useful for scatter-gather table entries.
- MemoryMapping -- Safe wrapper around mmap/munmap. Provides RAII semantics. Access via Rust references is forbidden; use VolatileSlice.
For protected VMs, guest memory is not accessible from the host: pKVM removes the guest's pages from the host's Stage-2 tables, so crosvm cannot read or write them. Shared memory regions used for virtio transport are the exception.
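The consequence of guest memory being non-contiguous is that every access must first be translated from a guest address to a region and an offset. The sketch below illustrates that translation with a toy model; `Region` and `ToyGuestMemory` are hypothetical types that store plain byte vectors, and they do not reflect crosvm's `GuestMemory` API or its volatile-access rules.

```rust
/// One contiguous region of guest physical memory (hypothetical type).
struct Region {
    guest_base: u64,
    data: Vec<u8>,
}

/// Toy model of guest memory built from non-contiguous regions.
struct ToyGuestMemory {
    regions: Vec<Region>,
}

impl ToyGuestMemory {
    /// Finds the region that fully contains `len` bytes starting at
    /// `addr`; returns the region index and the offset inside it.
    fn locate(&self, addr: u64, len: usize) -> Option<(usize, usize)> {
        for (index, region) in self.regions.iter().enumerate() {
            if addr < region.guest_base {
                continue;
            }
            let offset = addr - region.guest_base;
            let fits = offset
                .checked_add(len as u64)
                .map_or(false, |end| end <= region.data.len() as u64);
            if fits {
                return Some((index, offset as usize));
            }
        }
        None
    }

    /// Copies `src` into guest memory; fails if the range is not backed.
    fn write_at(&mut self, addr: u64, src: &[u8]) -> bool {
        match self.locate(addr, src.len()) {
            Some((index, offset)) => {
                self.regions[index].data[offset..offset + src.len()].copy_from_slice(src);
                true
            }
            None => false,
        }
    }

    /// Copies guest memory into `dst`; fails if the range is not backed.
    fn read_at(&self, addr: u64, dst: &mut [u8]) -> bool {
        match self.locate(addr, dst.len()) {
            Some((index, offset)) => {
                dst.copy_from_slice(&self.regions[index].data[offset..offset + dst.len()]);
                true
            }
            None => false,
        }
    }
}
```

Accesses that fall into a hole between regions, or that run past the end of a region, are rejected rather than silently wrapped, which is the behavior a device implementation needs when handling guest-supplied addresses.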
54.3.10 VM Control Sockets¶
crosvm uses Unix domain sockets for inter-process communication between the main process and device processes. From the architecture doc:
For the operations that devices need to perform on the global VM state, such as mapping into guest memory address space, there are the VM control sockets. There are a few kinds, split by the type of request and response that the socket will process. This also provides basic security privilege separation in case a device becomes compromised by a malicious guest.
The control socket types handle:
- Memory mapping requests
- MSI route allocation
- Guest memory registration/deregistration
- VM state changes (pause, resume, reset)
External control is available via the --socket argument, accessed through
the crosvm_control library or CLI subcommands like crosvm stop.
54.3.11 WaitContext Event Loop¶
Most crosvm threads use a WaitContext for their event loop. This is a
cross-platform abstraction over epoll (Linux) and WaitForMultipleObjects
(Windows):
// Conceptual event loop (simplified)
#[derive(EventToken)]
enum Token {
    VirtioQueue,
    InterruptResample,
    Kill,
}
let wait_ctx = WaitContext::new()?;
wait_ctx.add(&queue_evt, Token::VirtioQueue)?;
wait_ctx.add(&interrupt_resample, Token::InterruptResample)?;
wait_ctx.add(&kill_evt, Token::Kill)?;
loop {
    let events = wait_ctx.wait()?;
    for event in events {
        match event.token {
            Token::VirtioQueue => { /* process queue */ },
            Token::InterruptResample => { /* resample interrupt */ },
            Token::Kill => return Ok(()),
        }
    }
}
54.3.12 Code Organization¶
The crosvm codebase is organized into Rust crates, as documented in
external/crosvm/ARCHITECTURE.md:
external/crosvm/
src/ # Top-level binary frontend
aarch64/ # ARM64 architecture support
x86_64/ # x86_64 architecture support
riscv64/ # RISC-V 64 architecture support
base/ # Cross-platform safe wrappers
cros_async/ # Async runtime (io_uring + epoll)
devices/ # Virtual device implementations
disk/ # Disk image manipulation (raw, qcow)
hypervisor/ # Hypervisor abstraction layer
jail/ # Minijail sandboxing helpers
jail/seccomp/ # Per-architecture seccomp policies
kernel_loader/ # Kernel image loading
kvm_sys/ # KVM ioctl structures
kvm/ # KVM wrapper
net_util/ # TUN/TAP device creation
sync/ # Custom Mutex/Condvar
vfio_sys/ # VFIO structures for device passthrough
vhost/ # Vhost device wrappers
virtio_sys/ # Virtio kernel interface
vm_control/ # VM IPC definitions
vm_memory/ # VM memory objects
54.4 Microdroid¶
54.4.1 Overview¶
Microdroid is a minimal Android distribution designed specifically for running inside
AVF virtual machines. As described in packages/modules/Virtualization/build/microdroid/README.md:
Microdroid is a (very) lightweight version of Android that is intended to run on on-device virtual machines. It is built from the same source code as the regular Android, but it is much smaller; no system server, no HALs, no GUI, etc. It is intended to host headless & native workloads only.
54.4.2 What Microdroid Removes¶
Compared to full Android, Microdroid strips away nearly everything:
| Component | Full Android | Microdroid |
|---|---|---|
| System Server | Yes | No |
| Hardware Abstraction Layers | Full suite | None |
| GUI/SurfaceFlinger | Yes | No |
| Package Manager | Yes | No |
| Telephony | Yes | No |
| Bluetooth | Yes | No |
| WiFi stack | Yes | No |
| Camera | Yes | No |
| Audio service | Yes | No |
| SELinux policy | Full | Minimal |
| Init scripts | Hundreds | One (init.rc) |
What Microdroid retains:
- Linux kernel
- Bionic libc
- Init process (minimal configuration)
- APEX daemon (in VM mode)
- microdroid_manager (payload orchestration)
- Tombstoned (crash reporting)
- Basic filesystem support
54.4.3 VM Configuration¶
Microdroid VMs are configured through JSON files. The base configuration from
packages/modules/Virtualization/build/microdroid/microdroid.json:
{
  "kernel": "/apex/com.android.virt/etc/fs/microdroid_kernel",
  "disks": [
    {
      "partitions": [
        {
          "label": "vbmeta_a",
          "path": "/apex/com.android.virt/etc/fs/microdroid_vbmeta.img"
        },
        {
          "label": "super",
          "path": "/apex/com.android.virt/etc/fs/microdroid_super.img"
        }
      ],
      "writable": false
    }
  ],
  "memory_mib": 256,
  "console_input_device": "hvc0",
  "platform_version": "~1.0"
}
The configuration specifies:
- Kernel -- Path to the Microdroid kernel binary
- Disks -- Disk images including vbmeta (for verified boot) and super (the system partition in Android's dynamic partitions format)
- Memory -- 256 MiB default allocation
- Console -- hvc0 for virtio console I/O
54.4.4 Boot Process¶
The Microdroid boot process is tightly controlled:
sequenceDiagram
participant PVMFW as pvmfw
participant KERNEL as Microdroid Kernel
participant INIT as init
participant APEXD as apexd-vm
participant MM as microdroid_manager
participant PAYLOAD as VM Payload
PVMFW->>KERNEL: Verify and boot kernel
KERNEL->>INIT: Start init process
INIT->>INIT: Mount cgroups
INIT->>INIT: Start ueventd
INIT->>INIT: Apply debug policy
INIT->>MM: Start microdroid_manager
MM->>MM: Setup APK verification
MM->>APEXD: Start apexd in VM mode
APEXD-->>INIT: apexd.status = ready
INIT->>INIT: perform_apex_config
INIT->>INIT: Set apex_config.done = true
MM->>MM: Setup payload config
MM->>INIT: Set microdroid_manager.config_done = 1
INIT->>INIT: Mount /data (tmpfs, 128MB)
INIT->>INIT: Set dev.bootcomplete = 1
MM->>PAYLOAD: Launch payload (.so)
PAYLOAD->>PAYLOAD: AVmPayload_main()
The init.rc from packages/modules/Virtualization/build/microdroid/init.rc reveals
the boot orchestration:
on init
    mkdir /mnt/apk 0755 root root
    mkdir /mnt/extra-apk 0755 root root
    mkdir /mnt/tenant-apk 0755 root root
    # Microdroid_manager starts apkdmverity/zipfuse/apexd
    start microdroid_manager
    # Wait for apexd to finish activating APEXes
    wait_for_prop apexd.status ready
    perform_apex_config
    # Notify microdroid_manager that APEX config is done
    setprop apex_config.done true
54.4.5 Filesystem Layout¶
Microdroid uses a minimal filesystem layout from
packages/modules/Virtualization/build/microdroid/fstab.microdroid:
system /system ext4 noatime,ro,errors=panic wait,slotselect,avb=vbmeta,first_stage_mount,logical
/dev/block/by-name/microdroid-vendor /vendor ext4 noatime,ro,errors=panic wait,first_stage_mount,avb_hashtree_digest=/proc/device-tree/avf/vendor_hashtree_descriptor_root_digest
Key filesystem characteristics:
- Root -- Read-only, remounted after post-fs
- /system -- Read-only, verified boot via AVB
- /vendor -- Optional, verified via hashtree digest
- /data -- tmpfs (128 MiB), ephemeral
- /mnt/apk -- Mount point for payload APK
- /mnt/encryptedstore -- Encrypted persistent storage
54.4.6 Vendor Image Support¶
Microdroid supports optional vendor partitions for device-specific modules. The vendor image verification process differs between protected and non-protected VMs:
Non-protected VM: The virtualizationmanager creates a DTBO containing the vendor hashtree digest and passes it to the VM via crosvm. The digest is obtained from the host Android device tree under /avf/reference/.
Protected VM: The VM reference DT included in the pvmfw configuration data is used for additional validation. The bootloader appends the vendor hashtree digest into the VM reference DT. pvmfw validates that if a matching property is present in the VM's device tree, its value exactly matches the reference.
From the Microdroid README:
For pVM, VM reference DT included in pvmfw config data is additionally used for validating vendor hashtree digest. Bootloader should append vendor hashtree digest into VM reference DT based on fstab.microdroid.
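The validation rule for protected VMs (a property may be absent from the VM's device tree, but if present it must match the reference exactly) can be sketched as below. Device trees are modeled as flat maps from property name to raw bytes, which is a simplification, and the `Props` alias and `validate_against_reference` function are hypothetical names rather than pvmfw code.

```rust
use std::collections::BTreeMap;

/// Flat model of device tree properties: name -> raw value bytes.
type Props = BTreeMap<String, Vec<u8>>;

/// Checks every reference property against the VM's device tree.
/// A property missing from the VM DT is accepted; a property that is
/// present with a different value is rejected, naming the offender.
fn validate_against_reference(vm_dt: &Props, reference: &Props) -> Result<(), String> {
    for (name, expected) in reference {
        if let Some(actual) = vm_dt.get(name) {
            if actual != expected {
                return Err(name.clone());
            }
        }
    }
    Ok(())
}
```

Because the reference DT travels in the pvmfw configuration data, which sits in the protected region, the host cannot alter the expected digest; it can only omit the property or supply the correct value.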
54.4.7 VM Payload API¶
The VM Payload API provides the interface for code running inside a Microdroid VM.
It is a C API defined in packages/modules/Virtualization/libs/libvm_payload/:
// Entry point for VM payload code
#include <stdio.h>

extern "C" int AVmPayload_main() {
    printf("Hello Microdroid!\n");
    // Use VM Payload APIs here
    return 0;
}
Available APIs include:
- AVmPayload_requestAttestation() -- Request remote attestation
- AVmPayload_runVsockRpcServer() -- Host a binder server over vsock
- Secret derivation and sealing functions
- NDK subset: libc, logging, NdkBinder
Building a VM payload requires two build modules:
// The payload shared library
cc_library_shared {
    name: "MyMicrodroidPayload",
    srcs: ["**/*.cpp"],
    sdk_version: "current",
}
// The host app that contains the payload
android_app {
    name: "MyApp",
    srcs: ["**/*.java"],
    jni_libs: ["MyMicrodroidPayload"],
    use_embedded_native_libs: true,
    sdk_version: "current",
}
54.4.8 Platform Prerequisites¶
Microdroid requires:
- 64-bit target -- Either x86_64 or arm64. 32-bit is not supported.
- com.android.virt APEX -- Must be pre-installed on the device.
- KVM support -- /dev/kvm must exist.
- For protected VMs -- pKVM hypervisor must be active.
The APEX can be added to a product through the product makefile.
54.4.9 Encrypted Storage¶
Microdroid supports encrypted persistent storage for VMs that need to preserve
data across reboots. The encrypted store is backed by a file on the host and
mounted at /mnt/encryptedstore inside the VM.
From the init.rc:
on property:microdroid_manager.encrypted_store.status=mounted
    restorecon /mnt/encryptedstore
    # Performance tuning for storage
    write /proc/sys/vm/compaction_proactiveness 0
    write /sys/module/dm_verity/parameters/prefetch_cluster 0
    write /proc/sys/vm/swappiness 100
    setprop microdroid_manager.encrypted_store.status ready
The encryption keys are derived from the VM's DICE chain, ensuring that only the same VM instance (with the same code and configuration) can decrypt the data.
54.5 pVM Firmware
54.5.1 Purpose and Threat Model
The pVM firmware (pvmfw) is the first code that executes inside a protected VM. It serves as the root of trust for the VM, validating the guest environment before allowing any guest code to run.
From packages/modules/Virtualization/guest/pvmfw/README.md:
As pVMs are managed by a VMM running on the untrusted host, the virtual machine it configures can't be trusted either. Furthermore, even though the isolation mentioned above allows pVMs to protect their secrets from the host, it does not help with provisioning them during boot. In particular, the threat model would prohibit the host from ever having access to those secrets, preventing the VMM from passing them to the pVM.
The threat model assumes:
- The host OS may be fully compromised
- The VMM (crosvm) may be malicious
- The hypervisor (pKVM) and pvmfw itself are trusted
- Device hardware (including firmware up to pvmfw loading) is trusted
54.5.2 Source Architecture
The pvmfw source code is at packages/modules/Virtualization/guest/pvmfw/src/ and
is a no_std Rust binary:
// packages/modules/Virtualization/guest/pvmfw/src/main.rs
#![no_main]
#![no_std]
extern crate alloc;
mod arch;
mod bootargs;
mod config;
mod device_assignment;
mod dice;
mod entry;
mod fdt;
mod gpt;
mod instance;
mod memory;
mod rollback;
The no_std constraint means pvmfw is built without the Rust standard library: it
has no filesystem and no operating system services, and heap allocation is only
available through the alloc crate, backed by an allocator that pvmfw configures
itself. This minimizes the trusted computing base.
54.5.3 Entry Point and Boot Flow
The entry point in packages/modules/Virtualization/guest/pvmfw/src/entry.rs defines
the boot arguments and initialization sequence:
pub struct BootArgs {
/// Address of FDT
pub fdt: Option<usize>,
/// Address of first byte in payload image
pub payload_start: Option<usize>,
/// Size of payload in bytes
pub payload_size: Option<usize>,
/// Address of Linux x86 boot params structure
pub boot_params: Option<usize>,
}
Platform-specific argument parsing handles the differences between AArch64 and x86_64:
pub fn from_vmbase_args(argv: &[usize]) -> Self {
cfg_if::cfg_if! {
if #[cfg(target_arch = "aarch64")] {
Self {
fdt: argv.first().copied(),
payload_start: argv.get(1).copied(),
payload_size: argv.get(2).copied(),
boot_params: None,
}
} else if #[cfg(target_arch = "x86_64")] {
Self {
fdt: None,
payload_start: None,
payload_size: None,
boot_params: argv.get(1).copied(),
}
}
}
}
54.5.4 Main Verification Flow
The main function in packages/modules/Virtualization/guest/pvmfw/src/main.rs
orchestrates the complete verification process:
graph TB
START["pvmfw entry"] --> PARSE_DICE["Parse DICE handover"]
PARSE_DICE --> CHECK_DEBUG["Check debug policy consistency"]
CHECK_DEBUG --> VERIFY_BOOT["Verify guest kernel (AVB)"]
VERIFY_BOOT --> SANITIZE_DT["Sanitize device tree"]
SANITIZE_DT --> PARSE_RESMEM["Parse reserved memory"]
PARSE_RESMEM --> ROLLBACK["Perform rollback protection"]
ROLLBACK --> DICE_DERIVE["Derive next-stage DICE secrets"]
DICE_DERIVE --> KASLR["Generate KASLR seed"]
KASLR --> MODIFY_FDT["Modify FDT for next stage"]
MODIFY_FDT --> UNSHARE["Unshare memory from host"]
UNSHARE --> JUMP["Jump to guest kernel"]
The core main function signature from the source:
fn main<'a>(
untrusted_fdt: &mut Fdt,
signed_kernel: &[u8],
ramdisk: Option<&[u8]>,
current_dice_handover: Option<&[u8]>,
mut debug_policy: Option<&[u8]>,
vm_dtbo: Option<&mut [u8]>,
vm_ref_dt: Option<&[u8]>,
reserved_mem: Option<&[u8]>,
) -> Result<(&'a [u8], bool), RebootReason> {
info!("pVM firmware");
// ...
}
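The flow above is a chain of fallible stages in which the first failure aborts the boot with a specific reboot reason. The following is a minimal sketch of that pattern only, not the pvmfw implementation: the `Inputs` struct, the `check` helper, and the `boot` function are hypothetical stand-ins, and only three of the stages are modeled.

```rust
/// Subset of the reboot reasons listed in this chapter.
#[derive(Debug, PartialEq)]
enum RebootReason {
    InvalidDiceHandover,
    PayloadVerificationError,
    SecretDerivationError,
}

/// Hypothetical stand-in for the real inputs: each flag says whether the
/// corresponding stage would succeed.
struct Inputs {
    dice_handover_ok: bool,
    kernel_signature_ok: bool,
    derivation_ok: bool,
}

/// Turns a stage outcome into a `Result` carrying that stage's reboot reason.
fn check(ok: bool, reason: RebootReason) -> Result<(), RebootReason> {
    if ok {
        Ok(())
    } else {
        Err(reason)
    }
}

/// Runs the stages in order; `?` returns early on the first failure.
fn boot(inputs: &Inputs) -> Result<&'static str, RebootReason> {
    check(inputs.dice_handover_ok, RebootReason::InvalidDiceHandover)?;
    check(inputs.kernel_signature_ok, RebootReason::PayloadVerificationError)?;
    check(inputs.derivation_ok, RebootReason::SecretDerivationError)?;
    Ok("jump to guest kernel")
}

fn main() {
    let tampered = Inputs {
        dice_handover_ok: true,
        kernel_signature_ok: false,
        derivation_ok: true,
    };
    // The kernel check fails, so secret derivation is never attempted.
    println!("{:?}", boot(&tampered));
}
```

The point of the ordering is that no secret is derived for a payload that failed verification.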
54.5.5 Verified Boot
pvmfw uses Android Verified Boot (AVB) to verify the guest kernel and optional ramdisk. The verification uses an embedded public key:
/// Trusted public key, used during verification of the signed kernel & ramdisk.
const PUBLIC_KEY: &[u8] = include_bytes!(
concat!(env!("OUT_DIR"), "/pvmfw_embedded_key_pub.bin")
);
The verified boot process:
fn perform_verified_boot<'a>(
signed_kernel: &[u8],
ramdisk: Option<&[u8]>,
) -> Result<(VerifiedBootData<'a>, bool, usize), RebootReason> {
let verified_boot_data = verify_payload(signed_kernel, ramdisk, PUBLIC_KEY)
.map_err(|e| {
error!("Failed to verify the payload: {e}");
RebootReason::PayloadVerificationError
})?;
let debuggable = verified_boot_data.debug_level != DebugLevel::None;
let guest_page_size = verified_boot_data.page_size.unwrap_or(SIZE_4KB);
Ok((verified_boot_data, debuggable, guest_page_size))
}
54.5.6 DICE Derivation
After verification, pvmfw derives the next-stage DICE secrets. The DICE module at
packages/modules/Virtualization/guest/pvmfw/src/dice/mod.rs handles this:
// DICE Configuration Descriptor keys
const COMPONENT_NAME_KEY: i64 = -70002;
const SECURITY_VERSION_KEY: i64 = -70005;
const RKP_VM_MARKER_KEY: i64 = -70006;
const INSTANCE_HASH_KEY: i64 = -71003;
The derivation process:
- Parse the incoming DICE handover (CDIs + certificate chain)
- Compute partial DICE inputs from verified boot data
- Incorporate the instance hash (for per-VM differentiation)
- Perform rollback protection
- Derive the next-stage CDIs and certificate
fn perform_dice_derivation(
dice_handover_bytes: &[u8],
dice_context: DiceContext,
dice_inputs: PartialInputs,
salt: &[u8; HIDDEN_SIZE],
defer_rollback_protection: bool,
next_dice_handover: &mut [u8],
) -> Result<(), RebootReason> {
dice_inputs
.write_next_handover(
dice_handover_bytes.as_ref(),
salt,
defer_rollback_protection,
next_dice_handover,
dice_context,
)
.map_err(|e| {
error!("Failed to derive next-stage DICE secrets: {e:?}");
RebootReason::SecretDerivationError
})?;
Ok(())
}
The instance-specific salt ensures that different VM instances with identical payloads receive different secrets:
fn salt_from_instance_id(fdt: &Fdt) -> Result<Option<Hidden>, RebootReason> {
let Some(id) = read_instance_id(fdt).map_err(|e| {
error!("Failed to get instance-id in DT: {e}");
RebootReason::InvalidFdt
})?
else {
return Ok(None);
};
let salt = Digester::sha512()
.digest(&[&b"InstanceId:"[..], id].concat())
// ...
Ok(Some(salt))
}
54.5.7 Reboot Reasons
pvmfw defines specific reboot reasons that help diagnose boot failures. From
packages/modules/Virtualization/guest/pvmfw/src/entry.rs:
pub enum RebootReason {
InvalidDiceHandover, // "PVM_FIRMWARE_INVALID_DICE_HANDOVER"
InvalidConfig, // "PVM_FIRMWARE_INVALID_CONFIG_DATA"
InternalError, // "PVM_FIRMWARE_INTERNAL_ERROR"
InvalidFdt, // "PVM_FIRMWARE_INVALID_FDT"
InvalidPayload, // "PVM_FIRMWARE_INVALID_PAYLOAD"
InvalidRamdisk, // "PVM_FIRMWARE_INVALID_RAMDISK"
PayloadVerificationError, // "PVM_FIRMWARE_PAYLOAD_VERIFICATION_FAILED"
SecretDerivationError, // "PVM_FIRMWARE_SECRET_DERIVATION_FAILED"
}
Each reason is written to a dedicated console before reboot:
const REBOOT_REASON_CONSOLE: usize = 1;
console_writeln!(REBOOT_REASON_CONSOLE, "{}", reboot_reason.as_avf_reboot_string())
.unwrap();
reboot()
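The mapping from enum variant to console string can be sketched as a single `match`. This is an illustrative reconstruction based on the strings shown in the comments above; the exact signature and body of `as_avf_reboot_string` in the real source may differ.

```rust
/// Reboot reasons as listed above.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum RebootReason {
    InvalidDiceHandover,
    InvalidConfig,
    InternalError,
    InvalidFdt,
    InvalidPayload,
    InvalidRamdisk,
    PayloadVerificationError,
    SecretDerivationError,
}

impl RebootReason {
    /// Returns the string that is written to the reboot-reason console.
    /// (Sketch: the receiver type is an assumption.)
    fn as_avf_reboot_string(self) -> &'static str {
        match self {
            Self::InvalidDiceHandover => "PVM_FIRMWARE_INVALID_DICE_HANDOVER",
            Self::InvalidConfig => "PVM_FIRMWARE_INVALID_CONFIG_DATA",
            Self::InternalError => "PVM_FIRMWARE_INTERNAL_ERROR",
            Self::InvalidFdt => "PVM_FIRMWARE_INVALID_FDT",
            Self::InvalidPayload => "PVM_FIRMWARE_INVALID_PAYLOAD",
            Self::InvalidRamdisk => "PVM_FIRMWARE_INVALID_RAMDISK",
            Self::PayloadVerificationError => "PVM_FIRMWARE_PAYLOAD_VERIFICATION_FAILED",
            Self::SecretDerivationError => "PVM_FIRMWARE_SECRET_DERIVATION_FAILED",
        }
    }
}

fn main() {
    let reason = RebootReason::PayloadVerificationError;
    println!("{}", reason.as_avf_reboot_string());
}
```

Because the `match` is exhaustive, adding a new variant without a string fails to compile, which keeps the host-side diagnostics in step with the firmware.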
54.5.8 Configuration Data Format
pvmfw receives configuration data appended to its binary by the bootloader.
The configuration uses a versioned header format from
packages/modules/Virtualization/guest/pvmfw/src/config/mod.rs:
#[repr(C, packed)]
#[derive(Clone, Copy, Debug, FromBytes, Immutable, KnownLayout)]
struct Header {
/// Magic number; must be `Header::MAGIC`.
magic: u32,
/// Version of the header format.
version: Version,
/// Total size of the configuration data.
total_size: u32,
/// Feature flags; currently reserved and must be zero.
flags: u32,
}
The configuration data memory layout:
+===============================+
| pvmfw.bin |
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
| (Padding to 4KiB alignment) |
+===============================+ <-- HEAD
| Magic (= 0x666d7670) |
+-------------------------------+
| Version |
+-------------------------------+
| Total Size = (TAIL - HEAD) |
+-------------------------------+
| Flags |
+-------------------------------+
| Entry 0: DICE chain |
| Entry 1: Debug Policy |
| Entry 2: VM DTBO (v1.1) |
| Entry 3: VM ref DT (v1.2) |
| Entry 4: Reserved Mem (v1.3)|
+-------------------------------+
| Blob data follows... |
+===============================+ <-- TAIL
54.5.9 Configuration Versions
The configuration format has evolved across four versions:
Version 1.0:
- Entry 0: DICE chain handover (mandatory)
- Entry 1: Debug policy DTBO (optional)
Version 1.1:
- Entry 2: VM Device Assignment DTBO (optional, for device passthrough)
Version 1.2:
- Entry 3: VM reference DT (optional, for secure property passing)
Version 1.3:
- Entry 4: Reserved memory (optional, for confidential data to specific guests)
Each blob is referred to by offset and size in the entry array. Missing optional entries are denoted by zero size.
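To make the header and entry array concrete, here is a hedged sketch of a host-side parser for the layout above. It is not the pvmfw code. Several details are assumptions made for illustration: fields are read as little-endian, the version word packs the major number in the high 16 bits and the minor number in the low 16 bits, each entry is a pair of `u32` values (offset, then size), and offsets are measured from HEAD.

```rust
/// Magic from the layout above; its little-endian bytes spell "pvmf".
const MAGIC: u32 = 0x666d_7670;
/// magic + version + total_size + flags, four u32 fields.
const HEADER_SIZE: usize = 16;
/// ASSUMPTION: each entry is a (u32 offset, u32 size) pair.
const ENTRY_SIZE: usize = 8;

#[derive(Debug, PartialEq)]
struct Entry {
    offset: u32,
    size: u32,
}

/// Reads a u32 at `at` (ASSUMPTION: fields are little-endian).
fn read_u32(data: &[u8], at: usize) -> Option<u32> {
    let b = data.get(at..at + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Entries defined by each 1.x minor version, per the version list above.
fn entry_count(minor: u32) -> usize {
    match minor {
        0 => 2,
        1 => 3,
        2 => 4,
        _ => 5,
    }
}

/// Returns one slot per entry; `None` marks an absent (zero-size) entry.
fn parse(data: &[u8]) -> Result<Vec<Option<Entry>>, String> {
    let magic = read_u32(data, 0).ok_or("truncated header")?;
    if magic != MAGIC {
        return Err(format!("bad magic {magic:#x}"));
    }
    // ASSUMPTION: major version in the high 16 bits, minor in the low 16.
    let version = read_u32(data, 4).ok_or("truncated header")?;
    let (major, minor) = (version >> 16, version & 0xffff);
    if major != 1 {
        return Err(format!("unsupported major version {major}"));
    }
    let total_size = read_u32(data, 8).ok_or("truncated header")?;
    if total_size as usize > data.len() {
        return Err("total size exceeds buffer".to_string());
    }
    // Flags are reserved and must be zero.
    if read_u32(data, 12) != Some(0) {
        return Err("non-zero or missing flags".to_string());
    }
    let mut entries = Vec::new();
    for i in 0..entry_count(minor) {
        let base = HEADER_SIZE + i * ENTRY_SIZE;
        let offset = read_u32(data, base).ok_or("truncated entry array")?;
        let size = read_u32(data, base + 4).ok_or("truncated entry array")?;
        if size == 0 {
            entries.push(None);
        } else if offset.checked_add(size).map_or(true, |end| end > total_size) {
            return Err(format!("entry {i} lies outside the config data"));
        } else {
            entries.push(Some(Entry { offset, size }));
        }
    }
    Ok(entries)
}

fn main() {
    // Version 1.1 config: a 4-byte DICE blob at offset 40, no other entries.
    let mut blob = Vec::new();
    for word in [MAGIC, 0x0001_0001, 44, 0, 40, 4, 0, 0, 0, 0] {
        blob.extend_from_slice(&word.to_le_bytes());
    }
    blob.extend_from_slice(&[0xAA; 4]);
    println!("{:?}", parse(&blob));
}
```

Rejecting entries whose range falls outside `total_size` matters because the configuration is supplied by the bootloader and consumed before any guest code is trusted.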
54.5.10 VBMeta Properties
AVF defines special AVB VBMeta descriptor properties that pvmfw recognizes:
- com.android.virt.cap -- Capabilities list (pipe-separated):
  - remote_attest -- Hard-coded rollback protection index
  - secretkeeper_protection -- Defers rollback protection to guest
  - supports_uefi_boot -- Boots VM as EFI payload (experimental)
  - trusty_security_vm -- Skips rollback protection
- com.android.virt.page_size -- Guest page size in KiB (default: 4)
- com.android.virt.name -- VM name, used in DICE certificate:
  - "rkp_vm" -- Reserved for Remote Key Provisioning VM
  - "desktop-trusty" -- Reserved for Trusty desktop TEE VM
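As an illustration of the pipe-separated capabilities format, the sketch below splits a `com.android.virt.cap` value into typed capabilities. The `Capability` enum and `parse_capabilities` function are hypothetical names, and the policy of rejecting unknown or duplicate names is an assumption of this sketch rather than a statement about pvmfw's behavior.

```rust
/// Capability names from the list above.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Capability {
    RemoteAttest,
    SecretkeeperProtection,
    SupportsUefiBoot,
    TrustySecurityVm,
}

/// Parses a pipe-separated capability list such as "remote_attest|supports_uefi_boot".
/// ASSUMPTION: unknown and duplicate names are treated as errors.
fn parse_capabilities(prop: &str) -> Result<Vec<Capability>, String> {
    let mut caps = Vec::new();
    if prop.is_empty() {
        return Ok(caps); // An empty property means no capabilities.
    }
    for name in prop.split('|') {
        let cap = match name {
            "remote_attest" => Capability::RemoteAttest,
            "secretkeeper_protection" => Capability::SecretkeeperProtection,
            "supports_uefi_boot" => Capability::SupportsUefiBoot,
            "trusty_security_vm" => Capability::TrustySecurityVm,
            other => return Err(format!("unknown capability: {other}")),
        };
        if caps.contains(&cap) {
            return Err(format!("duplicate capability: {name}"));
        }
        caps.push(cap);
    }
    Ok(caps)
}

fn main() {
    println!("{:?}", parse_capabilities("remote_attest|secretkeeper_protection"));
    println!("{:?}", parse_capabilities("remote_attest|bogus"));
}
```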
54.5.11 Handover to Guest Kernel
After all verification and derivation is complete, pvmfw prepares the guest environment and jumps to the kernel:
- Unshare all non-essential memory from the host
- Unshare all MMIO regions except UART (if debuggable)
- Flush preserved memory (DICE handover, reserved memory)
- Compute the kernel entry point
- Jump to the payload
The DICE chain is passed to the guest via a device tree reserved-memory node:
/ {
reserved-memory {
dice {
compatible = "google,open-dice";
no-map;
reg = <0x0 0x7fe0000>, <0x0 0x1000>;
};
};
};
54.5.12 Memory Layout
pvmfw operates within a fixed memory layout defined by the crosvm protected VM configuration:
| Address | Size | Purpose |
|---|---|---|
| 0x7fc0_0000 | Variable | pvmfw binary + config data |
| 0x7fe0_0000 | 2 MiB | Scratch memory |
| 0x3f8 | MMIO | 16550 UART for logging |
| PCI bus | MMIO | virtio devices |
54.5.13 Development Workflow
For rapid iteration, pvmfw can be built and pushed without reflashing the device partition:
m pvmfw-tool pvmfw_bin
PVMFW_BIN=${ANDROID_PRODUCT_OUT}/system/etc/pvmfw.bin
DICE=${ANDROID_BUILD_TOP}/packages/modules/Virtualization/tests/pvmfw/assets/dice.dat
# Create pvmfw with test DICE chain
pvmfw-tool custom_pvmfw ${PVMFW_BIN} ${DICE}
# Push to device and set system property
adb push custom_pvmfw /data/local/tmp/pvmfw
adb root
adb shell setprop hypervisor.pvmfw.path /data/local/tmp/pvmfw
# Run a protected VM with the custom pvmfw
adb shell /apex/com.android.virt/bin/vm run-microdroid --protected
To run without pvmfw entirely (for debugging early boot issues):
54.6 VM Service Architecture
54.6.1 Service Overview
The AVF userspace service architecture consists of several cooperating components that manage VM lifecycle, security, and communication:
graph TB
subgraph "System Services"
VS["VirtualizationService\n(android.system.virtualizationservice)"]
MAINT["VirtualizationMaintenance"]
RPC["RemotelyProvisionedComponent\n(avf)"]
end
subgraph "Per-VM Processes"
VIRTMGR["virtmgr\n(VirtualizationService per-VM)"]
CROSVM["crosvm\n(VM process)"]
FD_SERVER["fd_server"]
end
subgraph "Client Tools"
VM_CLI["vm CLI"]
COMPOSD["composd"]
APP["Android App"]
end
subgraph "HAL Services"
CAPS["IVmCapabilitiesService"]
end
APP -->|"Java API"| VS
VM_CLI -->|"Binder"| VS
COMPOSD -->|"Binder"| VS
VS -->|"spawn"| VIRTMGR
VIRTMGR -->|"fork+exec"| CROSVM
VIRTMGR -->|"spawn"| FD_SERVER
VS -->|"Binder"| CAPS
VS --> MAINT
VS --> RPC
54.6.2 VirtualizationService
The VirtualizationService is the central daemon that manages global VM resources.
From packages/modules/Virtualization/android/virtualizationservice/src/main.rs:
fn try_main() -> Result<()> {
// ...
ProcessState::start_thread_pool();
let service = VirtualizationServiceInternal::init();
let internal_service =
BnVirtualizationServiceInternal::new_binder(
service.clone(), BinderFeatures::default()
);
register(INTERNAL_SERVICE_NAME, internal_service)?;
if is_remote_provisioning_hal_declared().unwrap_or(false) {
let remote_provisioning_service = remote_provisioning::new_binder();
register(REMOTELY_PROVISIONED_COMPONENT_SERVICE_NAME,
remote_provisioning_service)?;
}
if cfg!(llpvm_changes) {
let maintenance_service =
BnVirtualizationMaintenance::new_binder(
service.clone(), BinderFeatures::default()
);
register(MAINTENANCE_SERVICE_NAME, maintenance_service)?;
}
ProcessState::join_thread_pool();
// ...
}
The service registers up to three Binder interfaces:
- android.system.virtualizationservice -- The internal API for VM management
- android.hardware.security.keymint.IRemotelyProvisionedComponent/avf -- Remote key provisioning (if declared)
- android.system.virtualizationmaintenance -- VM maintenance operations
54.6.3 Global State Management
The VirtualizationServiceInternal singleton manages globally-unique resources:
pub struct VirtualizationServiceInternal {
state: Arc<Mutex<GlobalState>>,
display_service_set: Arc<Condvar>,
shutdown_monitor: Arc<Mutex<ShutdownMonitor>>,
}
Key managed resources include:
- CID allocation -- Each VM receives a unique vsock CID in the range 2048-65535.
- Temporary directories -- Per-VM working directories under /data/misc/virtualizationservice/
- Tombstone receiver -- Collects crash dumps from VMs
- Display service -- Optional display forwarding
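To illustrate what allocating a unique CID from a bounded range involves, here is a minimal sketch of a round-robin allocator over 2048-65535. The `CidAllocator` type and its methods are hypothetical and do not reflect how VirtualizationService actually tracks CIDs; only the range comes from the text above.

```rust
use std::collections::BTreeSet;

/// Bounds of the guest CID range given in the text.
const FIRST_CID: u32 = 2048;
const LAST_CID: u32 = 65535;

/// Hypothetical allocator: hands out CIDs round-robin and skips ones in use.
struct CidAllocator {
    next: u32,
    in_use: BTreeSet<u32>,
}

impl CidAllocator {
    fn new() -> Self {
        Self { next: FIRST_CID, in_use: BTreeSet::new() }
    }

    /// Returns a free CID, or `None` when the whole range is occupied.
    fn allocate(&mut self) -> Option<u32> {
        let span = LAST_CID - FIRST_CID + 1;
        for step in 0..span {
            // Wrap around to FIRST_CID after LAST_CID.
            let candidate = FIRST_CID + (self.next - FIRST_CID + step) % span;
            if self.in_use.insert(candidate) {
                self.next = if candidate == LAST_CID { FIRST_CID } else { candidate + 1 };
                return Some(candidate);
            }
        }
        None
    }

    /// Marks a CID as free again once its VM is gone.
    fn release(&mut self, cid: u32) {
        self.in_use.remove(&cid);
    }
}

fn main() {
    let mut cids = CidAllocator::new();
    let a = cids.allocate();
    let b = cids.allocate();
    println!("first two CIDs: {a:?} {b:?}");
    cids.release(2048);
    println!("next CID: {:?}", cids.allocate());
}
```

Round-robin allocation delays reuse of a recently released CID, which lowers the chance that a stale connection reaches a different VM.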
54.6.4 AIDL Interface
The VirtualizationService exposes a rich AIDL interface. The key types from
packages/modules/Virtualization/android/virtmgr/src/aidl.rs:
// VM configuration types
pub use VirtualMachineConfig::VirtualMachineConfig;
pub use VirtualMachineAppConfig::VirtualMachineAppConfig;
pub use VirtualMachineRawConfig::VirtualMachineRawConfig;
pub use VirtualMachineState::VirtualMachineState;
// VM lifecycle
pub use IVirtualMachine::IVirtualMachine;
pub use IVirtualMachineCallback::IVirtualMachineCallback;
pub use IVirtualizationService::IVirtualizationService;
// Security
pub use ISecretkeeper::ISecretkeeper;
pub use IAuthGraphKeyExchange::IAuthGraphKeyExchange;
pub use Certificate::Certificate;
54.6.5 VM Lifecycle
A VM goes through a well-defined lifecycle managed by the service:
stateDiagram-v2
[*] --> NOT_STARTED: createVm
NOT_STARTED --> STARTING: start
STARTING --> STARTED: crosvm running
STARTED --> READY: payload ready callback
READY --> FINISHED: payload exits normally
READY --> DEAD: crash / kill
STARTED --> DEAD: crash / kill
STARTING --> DEAD: boot failure
FINISHED --> [*]
DEAD --> [*]
VM states from the AIDL definition:
fn state_to_str(vm_state: VirtualMachineState) -> &'static str {
match vm_state {
VirtualMachineState::NOT_STARTED => "NOT_STARTED",
VirtualMachineState::STARTING => "STARTING",
VirtualMachineState::STARTED => "STARTED",
VirtualMachineState::READY => "READY",
VirtualMachineState::FINISHED => "FINISHED",
VirtualMachineState::DEAD => "DEAD",
_ => "(invalid state)",
}
}
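The state diagram above can also be expressed as a transition check. The sketch below encodes exactly the arrows shown in the diagram; the `VmState` enum and `can_transition` function are hypothetical names and not part of the AIDL interface.

```rust
/// Mirror of the states in the diagram above (hypothetical local type).
#[derive(Clone, Copy, Debug, PartialEq)]
enum VmState {
    NotStarted,
    Starting,
    Started,
    Ready,
    Finished,
    Dead,
}

/// Returns true if the diagram contains an arrow from `from` to `to`.
fn can_transition(from: VmState, to: VmState) -> bool {
    use VmState::*;
    matches!(
        (from, to),
        (NotStarted, Starting)
            | (Starting, Started)
            | (Starting, Dead)
            | (Started, Ready)
            | (Started, Dead)
            | (Ready, Finished)
            | (Ready, Dead)
    )
}

fn main() {
    println!("{}", can_transition(VmState::Ready, VmState::Finished));
    println!("{}", can_transition(VmState::Finished, VmState::Starting));
}
```

FINISHED and DEAD have no outgoing arrows, so a VM object that reaches either state is never restarted in this model.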
54.6.6 VM Creation Flow
The complete flow of creating and starting a VM:
sequenceDiagram
participant App as Android App
participant VS as VirtualizationService
participant VM as virtmgr
participant CV as crosvm
participant pKVM as pKVM
participant Guest as Microdroid
App->>VS: createVm(VirtualMachineConfig)
VS->>VS: Allocate CID, create temp directory
VS->>VM: Spawn virtmgr process
App->>VM: start()
VM->>VM: Prepare disk images
VM->>VM: Create instance partition
VM->>CV: Fork + exec crosvm
CV->>pKVM: KVM_CREATE_VM (protected mode)
pKVM->>pKVM: Load pvmfw into guest
CV->>pKVM: KVM_RUN (start VCPUs)
Note over pKVM,Guest: pvmfw verifies kernel, derives DICE
Guest->>Guest: Boot Microdroid
Guest->>Guest: Start microdroid_manager
Guest->>Guest: Launch payload
Guest-->>VM: Payload ready callback (vsock)
VM-->>App: onPayloadReady()
Note over App,Guest: VM is now READY
App->>VM: stop()
VM->>Guest: shutdown() via guest agent
Guest->>Guest: sys.powerctl = shutdown
Guest->>Guest: SIGTERM to services
Guest-->>CV: VM exits
CV-->>VM: Process exit
VM-->>App: onDied()
54.6.7 The vm CLI Tool
The vm command-line tool at packages/modules/Virtualization/android/vm/src/main.rs
provides shell access to VM operations:
#[derive(Parser)]
enum Opt {
/// Check if the feature is enabled on device.
CheckFeatureEnabled { feature: String },
/// Run a virtual machine with a config in APK
RunApp { config: RunAppConfig },
/// Run a virtual machine with Microdroid inside
RunMicrodroid { config: RunMicrodroidConfig },
/// Run a virtual machine
Run { config: RunCustomVmConfig },
/// List running virtual machines
List,
/// Print information about virtual machine support
Info,
/// Create a new empty partition
CreatePartition { path, size, partition_type },
/// Creates or update the idsig file
CreateIdsig { apk, path },
/// Connect to the serial console of a VM
Console { cid: Option<i32> },
}
Common operations:
# Run Microdroid with default configuration
adb shell /apex/com.android.virt/bin/vm run-microdroid
# Run a protected Microdroid VM
adb shell /apex/com.android.virt/bin/vm run-microdroid --protected
# Run with custom memory and CPU topology
adb shell /apex/com.android.virt/bin/vm run-microdroid \
--mem 512 --cpu-topology match_host
# List running VMs
adb shell /apex/com.android.virt/bin/vm list
# Get VM support information
adb shell /apex/com.android.virt/bin/vm info
54.6.8 VM Configuration Types
Two configuration types are supported:
AppConfig -- For running payloads from an APK:
VirtualMachineConfig::AppConfig(VirtualMachineAppConfig {
name: "VmRunApp".to_string(),
apk: apk_fd.into(),
idsig: idsig_fd.into(),
instanceImage: open_parcel_file(&instance, true)?.into(),
instanceId: instance_id,
payload: Payload::PayloadConfig(VirtualMachinePayloadConfig {
payloadBinaryName: "MyPayload.so".to_string(),
extraApks: vec![],
}),
debugLevel: DebugLevel::FULL,
protectedVm: true,
memoryMib: 256,
cpuOptions: CpuOptions { cpuTopology: CpuTopology::MatchHost(true) },
osName: "microdroid".to_string(),
hugePages: false,
// ...
})
RawConfig -- For running custom VM configurations from a JSON file:
let config_file = File::open(&config_path)?;
let vm_config = VmConfig::load(&config_file)?.to_parcelable()?;
VirtualMachineConfig::RawConfig(vm_config)
54.6.9 composd: Trusted Compilation Service
The composd service orchestrates trusted compilation of ART artifacts inside
a VM. From packages/modules/Virtualization/android/composd/src/composd_main.rs:
fn try_main() -> Result<()> {
// ...
let virtmgr = vmclient::VirtualizationService::new()
.context("Failed to spawn VirtualizationService")?;
let virtualization_service = virtmgr.connect()
.context("Failed to connect to VirtualizationService")?;
let instance_manager = Arc::new(InstanceManager::new(virtualization_service));
let composd_service = service::new_binder(instance_manager);
register_lazy_service("android.system.composd", composd_service.as_binder())
.context("Registering composd service")?;
// ...
}
The composd architecture:
graph LR
subgraph "Host Android"
COMPOSD["composd"]
IM["InstanceManager"]
IS["InstanceStarter"]
end
subgraph "CompOS VM"
COMPOS["CompOS Service"]
ODREFRESH["odrefresh"]
DEX2OAT["dex2oat"]
end
COMPOSD --> IM
IM --> IS
IS -->|"create VM"| COMPOS
COMPOS --> ODREFRESH
COMPOS --> DEX2OAT
composd uses the VM to run dex2oat compilation in a trusted environment, ensuring that the compiled artifacts have not been tampered with. The output is signed with a key derived from the VM's DICE chain.
54.6.10 Shutdown Protocol
VM shutdown follows a graceful protocol as defined in
packages/modules/Virtualization/docs/shutdown.md:
sequenceDiagram
participant Host as VM Owner
participant VS as VirtualizationService
participant Agent as Guest Agent
participant Init as init
participant MM as microdroid_manager
participant Payload as Payload
Host->>VS: VirtualMachine.stop()
VS->>Agent: IGuestAgent.shutdown()
Agent->>Init: Set sys.powerctl = "shutdown"
Init->>Init: Start reboot sequence (2s timeout)
Init->>MM: SIGTERM
Init->>Payload: SIGTERM (via process group)
alt Payload handles SIGTERM
Payload->>Payload: Clean up
Payload-->>MM: Exit
else Timeout (2 seconds)
Init->>MM: SIGKILL
end
Init->>Init: All processes done
Init->>Init: Power down
Note over Host,VS: If no guest agent or 5s timeout
VS->>VS: SIGKILL to crosvm process
The graceful shutdown timeout hierarchy:
- Payload receives SIGTERM and should clean up promptly
- init waits 2 seconds (ro.build.shutdown_timeout) before SIGKILL
- VirtualizationService waits 5 seconds after calling the guest agent, then kills the crosvm process directly
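The escalation order in this hierarchy can be sketched as a function of elapsed time. This is a simplified model: it assumes both timers start at the same instant (when the guest agent is called), which the real system does not guarantee, and the `Action` enum and `action_at` function are hypothetical names.

```rust
use std::time::Duration;

/// init's grace period before SIGKILL (2 seconds in the text above).
const INIT_GRACE: Duration = Duration::from_secs(2);
/// VirtualizationService's wait before killing crosvm (5 seconds).
const HOST_GRACE: Duration = Duration::from_secs(5);

/// What happens to a payload that is still running after `elapsed`.
#[derive(Debug, PartialEq)]
enum Action {
    WaitForPayload,
    InitSendsSigkill,
    HostKillsCrosvm,
}

/// ASSUMPTION: both timeouts are measured from the same starting point.
fn action_at(elapsed: Duration) -> Action {
    if elapsed >= HOST_GRACE {
        Action::HostKillsCrosvm
    } else if elapsed >= INIT_GRACE {
        Action::InitSendsSigkill
    } else {
        Action::WaitForPayload
    }
}

fn main() {
    for secs in [1, 3, 6] {
        println!("{secs}s: {:?}", action_at(Duration::from_secs(secs)));
    }
}
```

Because the host timeout is longer than the guest timeout, the guest normally gets the chance to finish its own shutdown before the host intervenes.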
54.6.11 Service VM
The Service VM is a special-purpose VM used for Remote Key Provisioning. From
packages/modules/Virtualization/guest/service_vm/README.md:
The Service VM is a lightweight, bare-metal virtual machine specifically designed to run various services for other virtual machines.
Key characteristics:
- Only one instance runs at a time
- Instance ID remains constant across updates
- Shares common code with pvmfw via libvmbase
- Processes CBOR-encoded requests over virtio-vsock
graph TB
subgraph "Service VM"
SVM["Service VM (bare-metal)"]
RKP_SVC["RKP Service"]
end
subgraph "Host"
VS["VirtualizationService"]
SVM_MGR["ServiceVmManager"]
end
subgraph "Client pVM"
CLIENT["pVM Payload"]
end
CLIENT -->|"attestation request"| VS
VS --> SVM_MGR
SVM_MGR -->|"manage lifecycle"| SVM
VS -->|"CBOR request via vsock"| RKP_SVC
RKP_SVC -->|"CBOR response"| VS
VS -->|"certificate"| CLIENT
54.6.12 Instance ID and CID Management
Each VM receives two identifiers:
- Instance ID -- A 64-byte random identifier that persists across VM reboots. It is stored in a file and incorporated into DICE derivation for consistent secrets.
- CID -- A 32-bit vsock Context ID allocated from the range 2048-65535. Used for host-guest communication.
Instance ID allocation from packages/modules/Virtualization/android/vm/src/run.rs:
let instance_id = {
let id_file = config.instance_id;
if id_file.exists() {
let mut id = [0u8; 64];
let mut instance_id_file = File::open(id_file)?;
instance_id_file.read_exact(&mut id)?;
id
} else {
let id = service.allocateInstanceId()
.context("Failed to allocate instance_id")?;
let mut instance_id_file = File::create(id_file)?;
instance_id_file.write_all(&id)?;
id
}
};
54.6.13 Tombstone Collection
VirtualizationService runs a tombstone receiver that listens for crash dumps from VMs over vsock. The receiver port is defined by the AIDL interface:
When a VM crashes, the tombstoned client in the guest sends the crash dump to the host, where it is stored using the standard Android tombstone infrastructure.
54.7 Hardware Capabilities
54.7.1 IVmCapabilitiesService HAL
The IVmCapabilitiesService HAL enables vendor-specific capabilities to be
granted to VMs. It is defined at
hardware/interfaces/virtualization/capabilities_service/aidl/android/hardware/virtualization/capabilities/IVmCapabilitiesService.aidl:
@VintfStability
interface IVmCapabilitiesService {
/**
* Grant access for the VM represented by the given vm_fd to the given
* vendor-owned tee services. The names in |vendorTeeServices| must match
* the ones defined in the tee_service_contexts files.
*/
void grantAccessToVendorTeeServices(
in ParcelFileDescriptor vmFd, in String[] vendorTeeServices);
}
As described in hardware/interfaces/virtualization/capabilities_service/README.md:
The IVmCapabilitiesService HAL is used in a flow to grant a pVM a capability to issue vendor-specific SMCs.
54.7.2 Implementation Structure
The HAL source tree contains the interface definition, two implementations (default and no-op), and VTS tests:
hardware/interfaces/virtualization/capabilities_service/
aidl/ # Interface definition
default/ # Reference implementation for partners
noop/ # No-op implementation for Cuttlefish/testing
vts/ # VTS (Vendor Test Suite) tests
Default implementation at
hardware/interfaces/virtualization/capabilities_service/default/src/aidl.rs:
pub struct VmCapabilitiesService {}
impl IVmCapabilitiesService for VmCapabilitiesService {
fn grantAccessToVendorTeeServices(
&self,
vm_fd: &ParcelFileDescriptor,
tee_services: &[String]
) -> binder::Result<()> {
info!("received {vm_fd:?} {tee_services:?}");
// TODO(b/360102915): implement
Ok(())
}
}
No-op implementation at
hardware/interfaces/virtualization/capabilities_service/noop/src/aidl.rs:
pub struct NoOpVmCapabilitiesService {}
impl IVmCapabilitiesService for NoOpVmCapabilitiesService {
fn grantAccessToVendorTeeServices(
&self,
vm_fd: &ParcelFileDescriptor,
tee_services: &[String]
) -> binder::Result<()> {
info!("received {vm_fd:?} {tee_services:?}");
Ok(())
}
}
54.7.3 Service Registration
The default service registers as a lazy Binder service from
hardware/interfaces/virtualization/capabilities_service/default/src/main.rs:
const SERVICE_NAME: &str =
"android.hardware.virtualization.capabilities.IVmCapabilitiesService/default";
fn try_main() -> Result<()> {
android_logger::init_once(
android_logger::Config::default()
.with_tag("IVmCapabilitiesService")
.with_max_level(LevelFilter::Info)
.with_log_buffer(android_logger::LogId::System),
);
ProcessState::start_thread_pool();
let service_impl = VmCapabilitiesService::init();
let service = BnVmCapabilitiesService::new_binder(
service_impl, BinderFeatures::default()
);
register_lazy_service(SERVICE_NAME, service.as_binder())
.with_context(|| format!("failed to register {SERVICE_NAME}"))?;
ProcessState::join_thread_pool();
bail!("thread pool unexpectedly ended");
}
54.7.4 TEE Service Access Flow
The capability grant flow allows VMs to issue vendor-specific SMC (Secure Monitor Call) instructions to communicate with trusted execution environments:
sequenceDiagram
participant App as Android App
participant VS as VirtualizationService
participant CAPS as IVmCapabilitiesService
participant pKVM as pKVM
participant TEE as Vendor TEE
App->>VS: createVm(config with tee_services)
VS->>VS: Create VM, get vm_fd
VS->>CAPS: grantAccessToVendorTeeServices(vm_fd, services)
CAPS->>pKVM: Configure SMC filtering for VM
Note over App,TEE: VM is now running
App->>VS: (VM makes SMC call)
pKVM->>pKVM: Check SMC filter
alt Allowed
pKVM->>TEE: Forward SMC
TEE-->>pKVM: SMC response
else Denied
pKVM-->>App: Inject fault
end
54.7.5 Device Assignment
AVF supports hardware device assignment using VFIO-platform. This allows a VM to have direct access to physical hardware devices without host intervention.
From packages/modules/Virtualization/docs/device_assignment.md:
Device assignment allows a VM to have direct access to HW without host/hyp intervention. AVF uses vfio-platform for device assignment, and host kernel support is required.
The device assignment flow requires:
- A VM DTBO describing assignable devices
- Physical device nodes with IOMMU references
- VFIO-platform kernel driver support
The vm CLI supports device assignment through the --devices flag:
adb shell /apex/com.android.virt/bin/vm run-microdroid \
--devices /sys/bus/platform/devices/example-device
VFIO support on the host is checked by the vm info command:
if Path::new("/dev/vfio/vfio").exists() {
println!("/dev/vfio/vfio exists.");
}
if Path::new("/sys/bus/platform/drivers/vfio-platform").exists() {
println!("VFIO-platform is supported.");
}
54.7.6 Hypervisor Properties
AVF queries hypervisor capabilities through system properties, managed by the
hypervisor_props library:
let non_protected_vm_supported = hypervisor_props::is_vm_supported()?;
let protected_vm_supported = hypervisor_props::is_protected_vm_supported()?;
if let Some(version) = hypervisor_props::version()? {
println!("Hypervisor version: {version}");
}
Key system properties:
- ro.boot.hypervisor.vm.supported -- Whether non-protected VMs are supported
- ro.boot.hypervisor.protected_vm.supported -- Whether pVMs are supported
- ro.boot.hypervisor.version -- Hypervisor version string
- hypervisor.pvmfw.path -- Override path for pvmfw binary
54.8 Try It
54.8.1 Checking Device Support
First, verify that your device supports virtualization:
# Check for KVM support
adb shell ls -la /dev/kvm
# Check VM support via the vm tool
adb shell /apex/com.android.virt/bin/vm info
Expected output on a supported device:
Both protected and non-protected VMs are supported.
Hypervisor version: 1.0
/dev/kvm exists.
/dev/vfio/vfio does not exist.
VFIO-platform is not supported.
Assignable devices: []
Available OS list: ["microdroid"]
Debug policy: none
54.8.2 Running a Microdroid VM
The simplest way to run a VM is using the shell helper script:
# Run a non-protected Microdroid VM
packages/modules/Virtualization/android/vm/vm_shell.sh start-microdroid
# Run a protected Microdroid VM with auto-connect
packages/modules/Virtualization/android/vm/vm_shell.sh \
start-microdroid --auto-connect -- --protected
Or directly with the vm tool:
# Run Microdroid directly
adb shell /apex/com.android.virt/bin/vm run-microdroid
# Run protected with debug output
adb shell /apex/com.android.virt/bin/vm run-microdroid \
--protected \
--debug full \
--console /data/local/tmp/virt/console.txt \
--log /data/local/tmp/virt/log.txt
54.8.3 Building a Payload App
Create a minimal VM payload:
Native payload (C++):
// my_payload.cpp
#include <stdio.h>
extern "C" int AVmPayload_main() {
printf("Hello from Microdroid VM!\n");
// Payload code runs here
return 0;
}
Build rules (Android.bp):
cc_library_shared {
name: "MyMicrodroidPayload",
srcs: ["my_payload.cpp"],
shared_libs: ["libvm_payload#current"],
sdk_version: "current",
}
android_app {
name: "MyPayloadApp",
srcs: ["**/*.java"],
jni_libs: ["MyMicrodroidPayload"],
use_embedded_native_libs: true,
sdk_version: "current",
}
Run the payload:
# Build and install
TARGET_BUILD_APPS=MyPayloadApp m apps_only dist
adb install out/dist/MyPayloadApp.apk
# Get the installed APK path
APK_PATH=$(adb shell pm path com.example.mypayloadapp | cut -d: -f2)
# Run the VM
TEST_ROOT=/data/local/tmp/virt
adb shell /apex/com.android.virt/bin/vm run-app \
--log $TEST_ROOT/log.txt \
--console $TEST_ROOT/console.txt \
$APK_PATH \
$TEST_ROOT/MyPayloadApp.apk.idsig \
$TEST_ROOT/instance.img \
--instance-id-file $TEST_ROOT/instance_id \
--payload-binary-name MyMicrodroidPayload.so
54.8.4 Java API Usage
For programmatic VM management from an Android app:
// Create VM configuration
VirtualMachineConfig config = new VirtualMachineConfig.Builder(context)
.setPayloadBinaryName("MyMicrodroidPayload.so")
.setDebugLevel(VirtualMachineConfig.DEBUG_LEVEL_FULL)
.setProtectedVm(true)
.setMemoryBytes(256 * 1024 * 1024) // 256 MiB
.build();
// Create and start the VM
VirtualMachineManager vmm = context.getSystemService(VirtualMachineManager.class);
VirtualMachine vm = vmm.getOrCreate("my-vm", config);
vm.setCallback(executor, new VirtualMachineCallback() {
@Override
public void onPayloadStarted(VirtualMachine vm) {
Log.i(TAG, "Payload started");
}
@Override
public void onPayloadReady(VirtualMachine vm) {
Log.i(TAG, "Payload ready");
}
@Override
public void onPayloadFinished(VirtualMachine vm, int exitCode) {
Log.i(TAG, "Payload finished: " + exitCode);
}
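@Override
public void onError(VirtualMachine vm, int errorCode, String message) {
Log.e(TAG, "VM error: " + message);
}
@Override
public void onStopped(VirtualMachine vm, int reason) {
Log.i(TAG, "VM stopped: " + reason);
}
});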
vm.run();
54.8.5 Running Tests
AVF includes comprehensive test suites:
# Run the main Microdroid host tests
atest MicrodroidHostTestCases
# Run the Microdroid app tests
atest MicrodroidTestApp
# Verify DICE chain validity (pVM required)
atest MicrodroidTests#protectedVmHasValidDiceChain
54.8.6 Debugging VMs
Console output:
# Direct console to a file
adb shell /apex/com.android.virt/bin/vm run-microdroid \
--console /data/local/tmp/console.txt
# Read console output
adb shell cat /data/local/tmp/console.txt
GDB debugging:
# Start VM with GDB server
adb shell /apex/com.android.virt/bin/vm run-microdroid \
--debug full --gdb 1234
# Connect GDB (from host)
adb forward tcp:1234 tcp:1234
gdb-multiarch -ex "target remote :1234"
Early console (earlycon):
# Enable earlycon for early boot debugging
adb shell /apex/com.android.virt/bin/vm run-microdroid \
--debug full --enable-earlycon
Listing running VMs:
# List the currently running VMs
adb shell /apex/com.android.virt/bin/vm list
Device tree dump:
# Dump the VM's device tree for inspection
adb shell /apex/com.android.virt/bin/vm run-microdroid \
--dump-device-tree /data/local/tmp/vm_dt.dtb
54.8.7 Custom VM Configuration¶
For advanced use cases, you can create a custom VM configuration:
{
"name": "my-custom-vm",
"kernel": "/data/local/tmp/Image",
"initrd": "/data/local/tmp/initramfs.img",
"params": "console=hvc0 earlycon=uart8250,mmio,0x3f8",
"disks": [
{
"partitions": [
{
"label": "rootfs",
"path": "/data/local/tmp/rootfs.img"
}
],
"writable": false
}
],
"protected": false,
"memory_mib": 512,
"platform_version": "~1.0"
}
Run with:
adb push my_vm_config.json /data/local/tmp/
adb shell /apex/com.android.virt/bin/vm run /data/local/tmp/my_vm_config.json
54.8.8 Inspecting AVF Components¶
APEX contents:
# List what's inside the AVF APEX
adb shell ls -la /apex/com.android.virt/
# Check the pvmfw binary
adb shell ls -la /apex/com.android.virt/etc/pvmfw.bin
# Check the Microdroid images
adb shell ls -la /apex/com.android.virt/etc/fs/
System properties:
# Check hypervisor status
adb shell getprop ro.boot.hypervisor.vm.supported
adb shell getprop ro.boot.hypervisor.protected_vm.supported
adb shell getprop ro.boot.hypervisor.version
# Check AVF features
adb shell /apex/com.android.virt/bin/vm check-feature-enabled remote_attestation
adb shell /apex/com.android.virt/bin/vm check-feature-enabled vendor_modules
adb shell /apex/com.android.virt/bin/vm check-feature-enabled device_assignment
54.8.9 Building AVF from Source¶
To build the complete AVF stack from AOSP source:
# Set up build environment
source build/envsetup.sh
lunch aosp_cf_x86_64_phone-userdebug # or aosp_panther-userdebug for Pixel 7
# Build the entire system (including AVF)
m
# Or build just the AVF APEX for faster iteration
banchan com.android.virt aosp_arm64 # or aosp_x86_64
UNBUNDLED_BUILD_SDKS_FROM_SOURCE=true m apps_only dist
# Install the APEX
adb install out/dist/com.android.virt.apex
adb reboot
54.8.10 Troubleshooting¶
VM fails to start:
- Check that /dev/kvm exists: adb shell ls -la /dev/kvm
- Verify the APEX is installed: adb shell pm list packages | grep virt
- Check logcat: adb logcat -s VirtualizationService:* virtmgr:* crosvm:*
Protected VM fails:
- Verify pKVM is enabled: adb shell getprop ro.boot.hypervisor.protected_vm.supported
- Check the pvmfw path: adb shell getprop hypervisor.pvmfw.path
- Check pvmfw reboot reasons in the console output
Performance issues:
- Use --hugepages for transparent huge pages support
- Use --cpu-topology match_host to match the host CPU topology
- Use --boost-uclamp for benchmarking stability
54.8.11 Remote Attestation Demo¶
The VmAttestationDemoApp at packages/modules/Virtualization/android/VmAttestationDemoApp/
demonstrates how a pVM payload can request remote attestation:
// Inside VM payload
extern "C" int AVmPayload_main() {
// Generate a challenge (typically from a remote server)
uint8_t challenge[32];
// ... fill challenge from server ...
// Request attestation
AVmAttestationResult* result = nullptr;
int status = AVmPayload_requestAttestation(challenge, sizeof(challenge), &result);
if (status != 0) {
// Attestation failed
return status;
}
// Use the attestation result
// - Get the certificate chain
// - Get the attested private key
// - Send certificate to remote server for verification
AVmPayload_freeAttestationResult(result);
return 0;
}
The attestation flow within the device:
sequenceDiagram
participant Payload as pVM Payload
participant MM as microdroid_manager
participant VS as VirtualizationService
participant SVM as Service VM (RKP)
participant RKP as RKP Server
Payload->>MM: AVmPayload_requestAttestation(challenge)
MM->>VS: Forward attestation request
VS->>SVM: Start Service VM (if not running)
VS->>SVM: Send CSR + pVM DICE chain
SVM->>SVM: Validate pVM DICE chain
SVM->>RKP: Submit RKP VM DICE chain + CSR
RKP->>RKP: Verify RKP VM identity
RKP-->>SVM: Signed certificate chain
SVM-->>VS: Attestation result
VS-->>MM: Certificate chain + key
MM-->>Payload: AVmAttestationResult
54.9 Rollback Protection¶
54.9.1 Overview¶
Rollback protection prevents an attacker from running an older, vulnerable version of a VM payload and accessing secrets that were provisioned to a newer version. pvmfw implements multiple rollback protection strategies, selected based on the VM type and platform capabilities.
From packages/modules/Virtualization/guest/pvmfw/src/rollback.rs:
pub fn perform_rollback_protection(
fdt: &Fdt,
verified_boot_data: &VerifiedBootData,
dice_inputs: &PartialInputs,
cdi_seal: &[u8],
) -> Result<(bool, Hidden, bool), RebootReason> {
let instance_hash = dice_inputs.instance_hash;
if let Some(fixed) = get_fixed_rollback_protection(verified_boot_data) {
perform_fixed_rollback_protection(verified_boot_data, fixed)?;
Ok((false, instance_hash.unwrap(), false))
} else if (should_defer_rollback_protection(fdt)?
&& verified_boot_data.has_capability(Capability::SecretkeeperProtection))
|| verified_boot_data.has_capability(Capability::TrustySecurityVm)
{
perform_deferred_rollback_protection(verified_boot_data)?;
Ok((false, instance_hash.unwrap(), true))
} else if cfg!(feature = "instance-img") {
perform_legacy_rollback_protection(fdt, dice_inputs, cdi_seal, instance_hash)
} else {
force_new_instance()
}
}
54.9.2 Rollback Protection Strategies¶
graph TB
START["perform_rollback_protection()"] --> CHECK_FIXED{"Is well-known VM?\n(RKP VM, Trusty)"}
CHECK_FIXED -->|Yes| FIXED["Fixed RBP:\nMatch exact rollback index\nor kernel hash"]
CHECK_FIXED -->|No| CHECK_DEFER{"Can defer RBP?\n(Secretkeeper capable)"}
CHECK_DEFER -->|Yes| DEFER["Deferred RBP:\nGuest handles own protection\nvia Secretkeeper"]
CHECK_DEFER -->|No| CHECK_INSTANCE{"instance-img\nfeature enabled?"}
CHECK_INSTANCE -->|Yes| LEGACY["Legacy RBP:\nUse instance.img\nblock device"]
CHECK_INSTANCE -->|No| NEW["Force new instance:\nRandom salt each boot"]
FIXED --> DONE["Return salt + status"]
DEFER --> DONE
LEGACY --> DONE
NEW --> DONE
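The selection order can be restated as a small pure function. This is a minimal sketch, not the pvmfw code: the Strategy and Inputs types and every field name are hypothetical stand-ins for the checks made in perform_rollback_protection(). It follows the quoted source rather than the diagram, so the TrustySecurityVm capability alone is enough to select the deferred path.

```rust
/// Which rollback-protection path is taken (illustrative names, not pvmfw types).
#[derive(Debug, PartialEq, Eq)]
pub enum Strategy {
    Fixed,
    Deferred,
    Legacy,
    ForceNewInstance,
}

/// Simplified decision inputs. In pvmfw these come from the verified boot
/// data, the FDT, and build-time features; the fields here are assumptions.
#[derive(Clone, Copy, Debug)]
pub struct Inputs {
    pub is_well_known_vm: bool,
    pub fdt_requests_deferral: bool,
    pub has_secretkeeper_capability: bool,
    pub has_trusty_security_vm_capability: bool,
    pub instance_img_feature: bool,
}

/// Mirrors the if / else-if chain: the first matching branch wins.
pub fn select_strategy(i: &Inputs) -> Strategy {
    if i.is_well_known_vm {
        Strategy::Fixed
    } else if (i.fdt_requests_deferral && i.has_secretkeeper_capability)
        || i.has_trusty_security_vm_capability
    {
        Strategy::Deferred
    } else if i.instance_img_feature {
        Strategy::Legacy
    } else {
        Strategy::ForceNewInstance
    }
}
```

Because the branches are ordered, a well-known VM always takes the fixed path even if it also advertises a deferral capability.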
Fixed Rollback Protection -- For well-known system VMs with specific identity:
enum FixedRollbackCriterion {
/// Image must match the exact kernel hash.
KernelHash { digest: Digest },
/// Image must match the exact rollback index and public key.
RollbackIndexPublicKey { index: u64, public_key: &'static [u8] },
/// Reserved name not supported on this platform.
Reserved { name: &'static str },
}
The RKP VM uses rollback index + public key verification:
match verified_boot_data.name.as_deref()? {
VerifiedBootData::RKP_VM_NAME =>
Some(FixedRollbackCriterion::RollbackIndexPublicKey {
index: service_vm_version::VERSION,
public_key: PUBLIC_KEY,
}),
VerifiedBootData::DESKTOP_TRUSTY_VM_NAME => {
// Platform-specific: kernel hash verification
}
_ => None,
}
Deferred Rollback Protection -- The guest handles its own rollback protection through Secretkeeper. pvmfw only validates that the rollback index is positive:
fn perform_deferred_rollback_protection(
verified_boot_data: &VerifiedBootData,
) -> Result<(), RebootReason> {
info!("Deferring rollback protection");
if verified_boot_data.rollback_index == 0 {
error!("Expected positive rollback_index, found 0");
Err(RebootReason::InvalidPayload)
} else {
Ok(())
}
}
Legacy Rollback Protection -- Uses the instance.img block device to store recorded DICE measurements. On subsequent boots, pvmfw compares current measurements against the recorded entry:
fn ensure_dice_measurements_match_entry(
dice_inputs: &PartialInputs,
entry: &EntryBody,
) -> Result<(), InstanceError> {
if entry.code_hash != dice_inputs.code_hash {
Err(InstanceError::RecordedCodeHashMismatch)
} else if entry.auth_hash != dice_inputs.auth_hash {
Err(InstanceError::RecordedAuthHashMismatch)
} else if entry.mode() != dice_inputs.mode {
Err(InstanceError::RecordedDiceModeMismatch)
} else {
Ok(())
}
}
54.10 Configuration Data Deep Dive¶
54.10.1 Config Parser Implementation¶
The pvmfw configuration parser at
packages/modules/Virtualization/guest/pvmfw/src/config/mod.rs implements rigorous
validation of the configuration data appended by the bootloader:
impl Header {
const MAGIC: u32 = u32::from_ne_bytes(*b"pvmf");
const VERSION_1_0: Version = Version { major: 1, minor: 0 };
const VERSION_1_1: Version = Version { major: 1, minor: 1 };
const VERSION_1_2: Version = Version { major: 1, minor: 2 };
const VERSION_1_3: Version = Version { major: 1, minor: 3 };
}
The parser validates:
- Magic number (0x666d7670 = "pvmf" in little-endian)
- Version compatibility
- Total size fits within the reserved region
- All entry offsets and sizes are within bounds
- Entries are in order (no overlapping)
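The checks listed above can be sketched as a standalone validator. This is an illustration under stated assumptions, not the pvmfw parser: the layout used here (a 16-byte fixed header of magic, version, total size, and flags, followed by one 8-byte offset/size pair per entry, with the major version in the upper 16 bits) is assumed for the example, and ConfigError, validate, and read_u32 are hypothetical names.

```rust
use std::ops::Range;

/// "pvmf" read as a little-endian u32, i.e. 0x666d7670.
const MAGIC: u32 = u32::from_le_bytes(*b"pvmf");
/// Assumed fixed header: magic, version, total_size, flags (4 bytes each).
const HEADER_FIXED_SIZE: usize = 16;
/// Assumed entry descriptor: offset and size (4 bytes each).
const ENTRY_SIZE: usize = 8;

#[derive(Debug, PartialEq, Eq)]
pub enum ConfigError {
    BufferTooSmall,
    InvalidMagic,
    UnsupportedVersion(u16, u16),
    InvalidSize(usize),
    EntryOutOfBounds(usize),
    EntryOutOfOrder,
}

/// Reads a little-endian u32, returning None if it would run off the buffer.
fn read_u32(data: &[u8], offset: usize) -> Option<u32> {
    let bytes = data.get(offset..offset.checked_add(4)?)?;
    let mut raw = [0u8; 4];
    raw.copy_from_slice(bytes);
    Some(u32::from_le_bytes(raw))
}

/// Returns the byte range of each present entry (None for absent entries).
pub fn validate(data: &[u8], entry_count: usize) -> Result<Vec<Option<Range<usize>>>, ConfigError> {
    let magic = read_u32(data, 0).ok_or(ConfigError::BufferTooSmall)?;
    if magic != MAGIC {
        return Err(ConfigError::InvalidMagic);
    }
    let version = read_u32(data, 4).ok_or(ConfigError::BufferTooSmall)?;
    let (major, minor) = ((version >> 16) as u16, version as u16);
    if major != 1 {
        return Err(ConfigError::UnsupportedVersion(major, minor));
    }
    let total_size = read_u32(data, 8).ok_or(ConfigError::BufferTooSmall)? as usize;
    let body_start = HEADER_FIXED_SIZE + entry_count * ENTRY_SIZE;
    if total_size < body_start || total_size > data.len() {
        return Err(ConfigError::InvalidSize(total_size));
    }

    let mut ranges = Vec::with_capacity(entry_count);
    let mut previous_end = body_start;
    for index in 0..entry_count {
        let base = HEADER_FIXED_SIZE + index * ENTRY_SIZE;
        let offset = read_u32(data, base).ok_or(ConfigError::BufferTooSmall)? as usize;
        let size = read_u32(data, base + 4).ok_or(ConfigError::BufferTooSmall)? as usize;
        if size == 0 {
            ranges.push(None); // A zero size marks an absent (optional) entry.
            continue;
        }
        let end = offset.checked_add(size).ok_or(ConfigError::EntryOutOfBounds(index))?;
        if offset < body_start || end > total_size {
            return Err(ConfigError::EntryOutOfBounds(index));
        }
        if offset < previous_end {
            return Err(ConfigError::EntryOutOfOrder);
        }
        previous_end = end;
        ranges.push(Some(offset..end));
    }
    Ok(ranges)
}
```

Requiring each entry to start at or after the end of the previous one enforces ordering and rules out overlap in a single comparison.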
54.10.2 Entry Types¶
The configuration entries are defined as an enum with a count sentinel:
#[derive(Clone, Copy, Debug)]
pub enum Entry {
DiceHandover, // Entry 0: DICE chain (mandatory)
DebugPolicy, // Entry 1: Debug policy DTBO (optional)
VmDtbo, // Entry 2: Device assignment DTBO (v1.1)
VmBaseDtbo, // Entry 3: VM reference DT (v1.2)
ReservedMem, // Entry 4: Reserved memory (v1.3)
_VARIANT_COUNT, // Sentinel for counting
}
The entries structure that main receives:
#[derive(Default)]
pub struct Entries<'a> {
pub dice_handover: Option<&'a mut [u8]>, // Mutable: will be zeroized
pub debug_policy: Option<&'a [u8]>, // Read-only
pub vm_dtbo: Option<&'a mut [u8]>, // Mutable: DTBO processing
pub vm_ref_dt: Option<&'a [u8]>, // Read-only
pub reserved_mem: Option<&'a mut [u8]>, // Mutable: will be zeroized
}
Note the careful ownership: mutable references are used for entries that contain secrets (DICE handover, reserved memory) so they can be zeroized after use. Read-only references are used for entries that only need inspection.
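The reason a mutable reference is needed can be shown with a short sketch. It assumes nothing about how pvmfw actually wipes its buffers; zeroize and with_secret are hypothetical helpers written for this example, using volatile writes so the compiler cannot discard the stores as dead code.

```rust
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

/// Overwrites `secret` with zeros using volatile writes.
pub fn zeroize(secret: &mut [u8]) {
    for byte in secret.iter_mut() {
        // SAFETY: `byte` is a valid, aligned, exclusive reference to a u8.
        unsafe { ptr::write_volatile(byte, 0) };
    }
    // Keep the compiler from reordering later accesses before the wipe.
    compiler_fence(Ordering::SeqCst);
}

/// Lends the secret to `f` as read-only data, then wipes the buffer.
pub fn with_secret<R>(secret: &mut [u8], f: impl FnOnce(&[u8]) -> R) -> R {
    let result = f(secret);
    zeroize(secret);
    result
}
```

A function that held only a shared reference (&[u8]) could read the secret but could not erase it, which is why the secret-bearing entries are exposed as &mut [u8].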
54.10.3 Version Negotiation¶
The parser handles forward compatibility by treating unknown minor versions as the latest known version:
pub fn entry_count(&self) -> Result<usize> {
let last_entry = match self.version {
Self::VERSION_1_0 => Entry::DebugPolicy,
Self::VERSION_1_1 => Entry::VmDtbo,
Self::VERSION_1_2 => Entry::VmBaseDtbo,
Self::VERSION_1_3 => Entry::ReservedMem,
v @ Version { major: 1, .. } => {
const LATEST: Version = Header::VERSION_1_3;
warn!("Parsing unknown config data version {v} as version {LATEST}");
return Ok(Entry::COUNT);
}
v => return Err(Error::UnsupportedVersion(v)),
};
Ok(last_entry as usize + 1)
}
This means a v1.4 config will be parsed as v1.3: pvmfw logs a warning and ignores any entries beyond the known set. Configs with a different major version (such as 2.x) are rejected with UnsupportedVersion.
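The mapping from version to entry count can be exercised in isolation. The following is a simplified restatement of entry_count() for illustration: Version is redefined locally, the error is reduced to the rejected version, and the counts are written out directly from the entry list in section 54.10.2.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Version {
    pub major: u16,
    pub minor: u16,
}

/// Number of entry types known to this sketch (entries 0 through 4).
const ENTRY_COUNT: usize = 5;

/// Returns how many entries a header of the given version describes,
/// or the version itself if it cannot be parsed.
pub fn entry_count(version: Version) -> Result<usize, Version> {
    match (version.major, version.minor) {
        (1, 0) => Ok(2),           // DiceHandover, DebugPolicy
        (1, 1) => Ok(3),           // + VmDtbo
        (1, 2) => Ok(4),           // + VmBaseDtbo
        (1, 3) => Ok(ENTRY_COUNT), // + ReservedMem
        (1, _) => Ok(ENTRY_COUNT), // Unknown minor: parse as the latest known
        _ => Err(version),         // Unknown major: reject
    }
}
```

Calling entry_count with version 1.4 yields the same count as 1.3, while version 2.0 yields an error.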
54.10.4 Error Handling¶
The config module defines precise error variants for each failure mode:
pub enum Error {
BufferTooSmall,
HeaderMisaligned,
InvalidMagic,
UnsupportedVersion(Version),
InvalidSize(usize),
MissingEntry(Entry),
EntryOutOfBounds(Entry, Range<usize>, Range<usize>),
EntryOutOfOrder,
}
Each error produces a clear diagnostic message. The InvalidMagic error has
special handling -- it triggers the legacy DICE handover path for backward
compatibility with Android T:
match config::Config::new(data) {
Ok(valid) => Some(Self::Config(valid)),
Err(config::Error::InvalidMagic) if cfg!(feature = "compat-raw-dice-handover") => {
warn!("Assuming the appended data to be a raw DICE handover");
Some(Self::LegacyDiceHandover(&mut data[..DICE_CHAIN_SIZE]))
}
Err(e) => {
error!("Invalid configuration data at {data_ptr:?}: {e}");
None
}
}
54.11 Device Tree Handling in pvmfw¶
54.11.1 FDT Sanitization¶
The device tree provided by the VMM is untrusted and must be sanitized before use. pvmfw uses a template-based approach, starting from a known-good FDT template and selectively copying validated properties from the untrusted FDT.
From packages/modules/Virtualization/guest/pvmfw/src/fdt.rs:
// Architecture-specific FDT templates
#[cfg(target_arch = "aarch64")]
const FDT_TEMPLATE: &Fdt = unsafe {
Fdt::unchecked_from_slice(pvmfw_fdt_template::RAW)
};
#[cfg(target_arch = "x86_64")]
const FDT_TEMPLATE: &Fdt = unsafe {
Fdt::unchecked_from_slice(pvmfw_fdt_template::RAW_X86_64)
};
The FDT validation catches several error conditions:
pub enum FdtValidationError {
/// Invalid CPU count.
InvalidCpuCount(usize),
/// Invalid VCpufreq Range.
InvalidVcpufreq(u64, u64),
/// Forbidden /avf/untrusted property.
ForbiddenUntrustedProp(&'static CStr),
}
54.11.2 Device Tree Modification for Next Stage¶
After sanitization, pvmfw modifies the FDT to pass information to the guest kernel:
- DICE chain -- Added as a /reserved-memory/dice node with compatible = "google,open-dice"
- KASLR seed -- Random seed for kernel address space layout randomization
- Boot parameters -- Debug level, instance status
- Reserved memory -- Confidential data regions
- Device assignment info -- If device passthrough is configured
The reserved-memory DICE node format:
/ {
reserved-memory {
#address-cells = <0x02>;
#size-cells = <0x02>;
ranges;
dice {
compatible = "google,open-dice";
no-map;
reg = <0x0 0x7fe0000>, <0x0 0x1000>;
};
};
};
54.11.3 Security Boundary at the FDT¶
The FDT represents a critical security boundary. The VMM constructs the FDT to describe the virtual platform, but in the protected VM threat model, the VMM is untrusted. pvmfw must therefore:
- Never trust device addresses or sizes from the untrusted FDT without validation
- Never trust the number of CPUs or memory layout without bounds checking
- Validate that properties critical to security (like the DICE chain location) are correctly formed
- Replace the untrusted FDT with a sanitized version before handing off to the guest kernel
This is why pvmfw starts from a template FDT rather than modifying the VMM-provided one in place -- it ensures the guest receives a device tree that only contains known-safe contents.
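The "validate, then copy into a template" pattern can be sketched as follows. This is not the pvmfw implementation: UntrustedInfo, SanitizedInfo, and the limits MAX_CPUS, RAM_BASE, and MAX_RAM_SIZE are hypothetical values chosen for the example. The point is that only values that pass a bounds check are ever carried forward.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ValidationError {
    InvalidCpuCount(usize),
    InvalidMemoryRange { base: u64, size: u64 },
}

/// Values read from the untrusted, VMM-provided device tree.
pub struct UntrustedInfo {
    pub cpu_count: usize,
    pub memory_base: u64,
    pub memory_size: u64,
}

/// Values that passed validation and may be written into the template.
#[derive(Debug, PartialEq, Eq)]
pub struct SanitizedInfo {
    pub cpu_count: usize,
    pub memory_base: u64,
    pub memory_size: u64,
}

// Hypothetical platform limits, chosen only for this example.
const MAX_CPUS: usize = 16;
const RAM_BASE: u64 = 0x8000_0000;
const MAX_RAM_SIZE: u64 = 0x40_0000_0000;

/// Copies a field only after it has been checked against known-good bounds.
pub fn sanitize(untrusted: &UntrustedInfo) -> Result<SanitizedInfo, ValidationError> {
    if untrusted.cpu_count == 0 || untrusted.cpu_count > MAX_CPUS {
        return Err(ValidationError::InvalidCpuCount(untrusted.cpu_count));
    }
    let memory_ok = untrusted.memory_base == RAM_BASE
        && untrusted.memory_size != 0
        && untrusted.memory_size <= MAX_RAM_SIZE;
    if !memory_ok {
        return Err(ValidationError::InvalidMemoryRange {
            base: untrusted.memory_base,
            size: untrusted.memory_size,
        });
    }
    Ok(SanitizedInfo {
        cpu_count: untrusted.cpu_count,
        memory_base: untrusted.memory_base,
        memory_size: untrusted.memory_size,
    })
}
```

Anything the untrusted tree contains that is not explicitly copied into SanitizedInfo never reaches the guest, which is the property the template approach is designed to provide.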
54.12 vmbase: Common VM Base Library¶
54.12.1 Purpose¶
The vmbase library at packages/modules/Virtualization/libs/libvmbase/ provides
shared low-level infrastructure for bare-metal Rust binaries running in crosvm VMs.
Both pvmfw and the Service VM build upon vmbase.
From the vmbase README:
This directory contains a Rust crate and static library which can be used to write no_std Rust binaries to run in an aarch64 VM under crosvm (via the VirtualizationService), such as for pVM firmware, a VM bootloader or kernel.
54.12.2 Provided Infrastructure¶
vmbase provides:
- Entry point -- Initializes the MMU with identity mapping, enables cache, prepares the image, and allocates a stack
- Exception vector -- Calls user-defined exception handlers
- UART driver -- Console logging via println! at MMIO address 0x3f8
- Power management -- shutdown() and reboot() via PSCI calls
- Heap allocation -- Configurable heap for no_std binaries
- Page table manipulation -- Memory management unit setup
- PSCI calls -- Power State Coordination Interface wrappers
54.12.3 Source Organization¶
packages/modules/Virtualization/libs/libvmbase/
arch/ # Architecture-specific code
arch.rs # Architecture abstraction
bionic.rs # Bionic compatibility shims
bzimage.rs # bzImage (Linux) boot support
console.rs # Console output
entry.rs # Entry point macros
fdt/ # Flattened Device Tree support
fdt.rs # FDT utilities
heap.rs # Heap allocator
layout.rs # Memory layout definitions
lib.rs # Crate root
linker.rs # Linker support
logger.rs # Logging infrastructure
memory/ # Memory management
memory.rs # Memory tracking
mmu.rs # Memory Management Unit
power.rs # PSCI power management
rand.rs # Random number generation
uart.rs # UART driver
util.rs # Utilities
virtio/ # VirtIO device support
virtio.rs # VirtIO abstractions
54.12.4 Using vmbase for Custom Binaries¶
A minimal vmbase binary requires:
#![no_main]
#![no_std]
use vmbase::{logger, main};
use log::{info, LevelFilter};
main!(main);
pub fn main(arg0: u64, arg1: u64, arg2: u64, arg3: u64) {
logger::init(LevelFilter::Info).unwrap();
info!("Hello world");
}
The build system uses a combination of rust_ffi_static and cc_binary rules
with custom linker scripts:
rust_ffi_static {
name: "libvmbase_example",
defaults: ["vmbase_ffi_defaults"],
crate_name: "vmbase_example",
srcs: ["src/main.rs"],
rustlibs: ["libvmbase"],
}
The entry point macro wraps the user function with:
- Console driver initialization (UART at 0x3f8)
- Stack setup
- PSCI SYSTEM_OFF call on return
54.12.5 Memory Management in vmbase¶
The memory.rs module in pvmfw uses vmbase's memory tracking:
pub(crate) struct MemorySlices<'a> {
pub fdt: &'a mut libfdt::Fdt,
pub kernel: &'a [u8],
pub ramdisk: Option<&'a [u8]>,
pub preserved_memory: Option<&'a [u8]>,
pub boot_params: Option<&'a mut bzimage::boot_params>,
}
Memory regions are mapped with explicit read-only or read-write permissions:
fn map_data_slice_mut<'a>(addr: usize, size: usize)
-> Result<&'a mut [u8], MemoryTrackerError>
{
let nonzero_size = size.try_into().map_err(|_| {
error!("Invalid size specified for the range: {size:#x}");
MemoryTrackerError::SizeTooSmall
})?;
map_data(addr, nonzero_size)?;
let mut_slice = unsafe {
slice::from_raw_parts_mut(addr as *mut u8, size)
};
Ok(mut_slice)
}
fn map_data_slice<'a>(addr: usize, size: usize)
-> Result<&'a [u8], MemoryTrackerError>
{
let nonzero_size = size.try_into().map_err(|e| {
error!("Invalid size specified for the range: {e}");
MemoryTrackerError::SizeTooSmall
})?;
map_rodata(addr, nonzero_size)?;
let slice = unsafe {
slice::from_raw_parts(addr as *const u8, size)
};
Ok(slice)
}
This separation ensures that regions pvmfw only reads (such as the kernel image and ramdisk) are mapped read-only, while regions it must modify (such as the FDT) are mapped read-write.
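Both helpers convert the size before mapping anything, and that conversion fails only when the size is zero. The sketch below isolates that step. It assumes, based on the variable name nonzero_size, that the target type is a non-zero integer such as NonZeroUsize; MapError and checked_size are hypothetical names.

```rust
use std::convert::TryFrom;
use std::num::NonZeroUsize;

#[derive(Debug, PartialEq, Eq)]
pub enum MapError {
    SizeTooSmall,
}

/// Rejects empty ranges up front: the conversion to NonZeroUsize fails
/// only when `size` is zero.
pub fn checked_size(size: usize) -> Result<NonZeroUsize, MapError> {
    NonZeroUsize::try_from(size).map_err(|_| MapError::SizeTooSmall)
}
```

Encoding "non-empty" in the type lets the mapping routine skip the zero-length check, and it guarantees the later slice::from_raw_parts call is never reached for a range that was not mapped.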
54.13 Device Assignment in Detail¶
54.13.1 Architecture¶
Device assignment (also called device passthrough) allows a VM to directly access physical hardware devices without host/hypervisor intervention on the data path. AVF uses VFIO-platform for this purpose.
From packages/modules/Virtualization/docs/device_assignment.md:
Device assignment allows a VM to have direct access to HW without host/hyp intervention. AVF uses vfio-platform for device assignment, and host kernel support is required.
graph TB
subgraph "Host"
VFIO["VFIO-platform Driver"]
IOMMU["Physical IOMMU"]
end
subgraph "pKVM"
S2["Stage-2 Tables"]
DA["Device Assignment\nValidation"]
end
subgraph "VM"
GUEST_DRV["Guest Device Driver"]
end
subgraph "Hardware"
DEV["Physical Device"]
end
GUEST_DRV -->|"MMIO access"| S2
S2 -->|"direct"| DEV
DEV -->|"DMA"| IOMMU
IOMMU -->|"translated"| S2
VFIO -->|"unbind from host"| DEV
DA -->|"validate DTBO"| S2
54.13.2 VM DTBO Structure¶
The VM Device Tree Blob Overlay (DTBO) describes assignable devices. It has two sections:
Overlayable devices (applied to VM DT):
// Devices visible to the VM
&{/} {
my_device@12340000 {
compatible = "vendor,my-device";
reg = <0x0 0x12340000 0x0 0x1000>;
interrupts = <0 42 4>;
};
};
Physical device descriptions (not applied, used for verification):
/host {
// Physical IOMMU
iommu@0 {
#iommu-cells = <1>;
android,pvmfw,token = <0x0 0x12345678>;
};
// Physical device
phys_device@abcd0000 {
reg = <0x0 0xabcd0000 0x0 0x1000>;
iommus = <&iommu 0x1>;
android,pvmfw,target = <&my_device>;
};
};
54.13.3 pvmfw Device Assignment Validation¶
The pvmfw device assignment module at
packages/modules/Virtualization/guest/pvmfw/src/device_assignment.rs validates
the DTBO against the physical platform:
pub enum DeviceAssignmentError {
InvalidDtbo,
InvalidSymbols,
MalformedReg,
MissingReg(u64, u64),
ExtraReg(u64, u64),
InvalidReg(u64),
InvalidRegToken(u64, u64),
InvalidRegSize(u64, u64),
InvalidInterrupts,
MalformedIommus,
InvalidIommus,
InvalidPhysIommu,
InvalidPvIommu,
TooManyPvIommu,
DuplicatedIommuIds,
DuplicatedPvIommuIds,
UnsupportedPathFormat,
// ... additional error variants
}
The validation ensures:
- Physical register addresses match what the hypervisor reports
- IOMMU tokens are valid and consistent
- Device nodes reference valid overlayable targets
- No duplicate IOMMU or device entries exist
54.13.4 IOMMU Token Verification¶
Each IOMMU in the VM DTBO carries a token -- a hypervisor-specific 64-bit value that uniquely identifies a physical IOMMU. pvmfw validates these tokens against what the hypervisor reports:
sequenceDiagram
participant ABL as Bootloader
participant pKVM as pKVM
participant PVMFW as pvmfw
ABL->>pKVM: Provide VM DTBO with IOMMU tokens
Note over ABL,pKVM: Tokens must be constant across boots
pKVM->>PVMFW: Load pvmfw + config (includes VM DTBO)
PVMFW->>pKVM: Query device IOMMU bindings
pKVM-->>PVMFW: Physical IOMMU tokens
PVMFW->>PVMFW: Validate DTBO tokens match pKVM tokens
alt Tokens match
PVMFW->>PVMFW: Apply DTBO to VM device tree
else Tokens mismatch
PVMFW->>PVMFW: Reject device assignment
end
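The comparison step in this flow can be sketched as a set check. This is an illustration, not the pvmfw logic: PhysIommu, its id field, and validate_iommus are hypothetical. The two error names are borrowed from the DeviceAssignmentError list above, but the conditions that trigger them here are assumptions.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq, Eq)]
pub enum TokenError {
    DuplicatedIommuIds,
    InvalidPhysIommu,
}

/// A physical IOMMU binding: the 64-bit token plus a device-specific ID.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct PhysIommu {
    pub token: u64,
    pub id: u64,
}

/// Accepts the DTBO bindings only if each is unique and also appears in
/// the set reported by the hypervisor.
pub fn validate_iommus(dtbo: &[PhysIommu], reported: &[PhysIommu]) -> Result<(), TokenError> {
    let reported: HashSet<PhysIommu> = reported.iter().copied().collect();
    let mut seen = HashSet::new();
    for iommu in dtbo {
        if !seen.insert(*iommu) {
            return Err(TokenError::DuplicatedIommuIds);
        }
        if !reported.contains(iommu) {
            return Err(TokenError::InvalidPhysIommu);
        }
    }
    Ok(())
}
```

Any mismatch rejects the whole assignment, matching the "Tokens mismatch" branch of the diagram.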
54.14 Async I/O in crosvm¶
54.14.1 cros_async Runtime¶
crosvm includes its own async runtime (cros_async) that provides two executor
backends:
- io_uring -- Uses Linux io_uring for high-performance asynchronous I/O
- epoll -- Falls back to epoll-based polling
From the code organization in external/crosvm/ARCHITECTURE.md:
cros_async - Runtime for async/await programming. This crate provides a Future executor based on io_uring and one based on epoll.
The executor type can be configured at VM startup:
if let Some(async_executor) = cfg.async_executor {
cros_async::Executor::set_default_executor_kind(async_executor)
.context("Failed to set the default async executor")?;
}
54.14.2 Virtio Queue Processing¶
Each virtio device's worker thread uses the async runtime for queue processing. The general pattern (simplified from the architecture doc):
// Worker thread for a virtio device (conceptual)
async fn process_queue(
queue: Queue,
mem: GuestMemory,
interrupt: Interrupt,
) -> Result<()> {
loop {
// Wait for the guest to submit descriptors
let desc_chain = queue.next_async(&mem).await?;
// Process the request
let response = handle_request(&desc_chain, &mem)?;
// Write response and signal completion
queue.add_used(&mem, desc_chain.index, response.len());
interrupt.signal_used_queue(queue.vector());
}
}
54.14.3 VirtIO Transport¶
For protected VMs, the virtio transport operates over shared memory regions. The guest must explicitly share the memory used for virtio rings with the host using pKVM hypercalls:
graph LR
subgraph "Guest Memory (Protected)"
PRIV["Private Data"]
end
subgraph "Shared Memory"
VRING["Virtio Rings\n(descriptor table,\navailable ring,\nused ring)"]
BUFFERS["Data Buffers\n(for I/O)"]
end
subgraph "Host/crosvm"
DEV["Device Backend"]
end
PRIV -.->|"Copy to shared"| BUFFERS
VRING <-->|"MMIO trap"| DEV
BUFFERS <-->|"DMA"| DEV
54.15 Network and Display Support¶
54.15.1 Network Support¶
AVF provides optional network support for VMs through the vmnic and
vmtethering services. Network capability is gated behind a feature flag:
When enabled, the VM configuration includes:
The network stack uses virtio-net for guest-host communication, with the
VmTethering service handling NAT/tethering on the host side.
54.15.2 Display Support¶
The TerminalApp at packages/modules/Virtualization/android/TerminalApp/
provides a terminal interface for VM interaction. Display forwarding uses
the display_service registered with VirtualizationService:
pub struct VirtualizationServiceInternal {
state: Arc<Mutex<GlobalState>>,
display_service_set: Arc<Condvar>,
// ...
}
54.16 Running Linux with Graphics Acceleration¶
Android's Virtualization Framework (AVF) supports running full Linux distributions (Debian) inside VMs with hardware-accelerated graphics. This enables a desktop Linux experience — including GUI applications, browsers, and development tools — running alongside Android apps on the same device.
54.16.1 Architecture Overview¶
The Linux VM stack combines several components:
graph TB
subgraph Android["Android Host"]
TA["TerminalApp<br/>DisplayActivity"]
SV["SurfaceView<br/>Display output"]
IF["InputForwarder<br/>Touch/keyboard/mouse"]
VMS["VmLauncherService<br/>VM lifecycle"]
ADS["Android Display<br/>Backend (C++)"]
TA --> SV
TA --> IF
TA --> VMS
VMS --> ADS
end
subgraph VM["Linux Guest VM (Debian)"]
KERN["Linux Kernel<br/>virtio drivers"]
DESK["Desktop Environment<br/>GUI applications"]
KERN --> DESK
end
subgraph crosvm["crosvm VMM"]
VGPU["virtio-gpu<br/>gfxstream / 2D"]
VINP["virtio-input<br/>evdev forwarding"]
VNET["virtio-net<br/>Network"]
VBLK["virtio-blk<br/>Root filesystem"]
end
SV <-->|"ANativeWindow<br/>surface buffer"| ADS
ADS <-->|"ICrosvmAndroid<br/>DisplayService"| VGPU
IF -->|"VirtualMachine<br/>sendKeyEvent()"| VINP
KERN <--> VGPU
KERN <--> VINP
KERN <--> VNET
KERN <--> VBLK
54.16.2 TerminalApp: The Linux VM Frontend¶
The TerminalApp at packages/modules/Virtualization/android/TerminalApp/
is the Android-side UI for Linux VMs. It manages the full lifecycle:
VM Launch Flow¶
sequenceDiagram
participant User
participant TA as TerminalApp
participant VMS as VmLauncherService
participant VMM as VirtualMachineManager
participant CV as crosvm
User->>TA: Open Terminal App
TA->>VMS: startService(displayInfo)
VMS->>VMS: Parse vm_config.json
VMS->>VMS: Configure GPU (gfxstream or 2D)
VMS->>VMM: create("debian", config)
VMM->>CV: Launch crosvm with virtio devices
CV-->>VMS: VM running
VMS->>TA: VM_LAUNCHER_SERVICE_READY
TA->>TA: Start DisplayActivity
TA->>VMS: Connect display surface
Note over TA,CV: Display output flows<br/>Guest → virtio-gpu → crosvm → Android Surface
// Source: packages/modules/Virtualization/android/TerminalApp/java/.../VmLauncherService.kt:67
// VmLauncherService manages VM lifecycle, GPU config, disk management
// Launches Debian VM with display, audio, input, and network
Display Configuration¶
The VM display adapts to the Android device's screen:
// Source: packages/modules/Virtualization/android/TerminalApp/java/.../VmLauncherService.kt:622
data class DisplayInfo(
val width: Int, // Device display width
val height: Int, // Device display height
val dpi: Int, // Pixel density
val refreshRate: Int // Display refresh rate
) : Parcelable
54.16.3 Graphics Acceleration Modes¶
The Linux VM supports two GPU rendering modes:
| Mode | Backend | Rendering | Performance | Use Case |
|---|---|---|---|---|
| Gfxstream | gfxstream | Host GPU via Vulkan | Near-native | Devices with GPU support |
| Lavapipe | 2d | Software (CPU-based) | Slow but universal | Fallback / testing |
Gfxstream Configuration¶
When hardware GPU acceleration is available, the VM uses gfxstream to forward Vulkan commands from the guest to the host GPU:
// Source: packages/modules/Virtualization/android/TerminalApp/java/.../VmLauncherService.kt:355
if (isGfxstreamEnabled()) {
builder.setGpuConfig(
GpuConfig.Builder()
.setBackend("gfxstream")
.setRendererUseSurfaceless(true)
.setRendererUseVulkan(true)
.setContextTypes(arrayOf("gfxstream-vulkan", "gfxstream-composer"))
.setRendererFeatures("VulkanDisableCoherentMemoryAndEmulate:enabled")
.build()
)
}
The GPU configuration supports these parameters:
// Source: packages/modules/Virtualization/.../VirtualMachineCustomImageConfig.java:911
class GpuConfig {
String backend; // "gfxstream" or "2d"
String[] contextTypes; // ["gfxstream-vulkan", "gfxstream-composer"]
boolean rendererUseEgl;
boolean rendererUseGles;
boolean rendererUseSurfaceless;
boolean rendererUseVulkan;
String rendererFeatures; // Feature flags
String pciAddress; // GPU PCI address
}
Graphics Acceleration Selection¶
The GraphicsManager lets users choose between hardware and software
rendering:
// Source: packages/modules/Virtualization/android/TerminalApp/java/.../GraphicsManager.kt
// Checks R.bool.gfxstream_supported (default: false, overridable per device)
// Persists selection in SharedPreferences
Device manufacturers enable gfxstream by overriding the resource:
<!-- Source: packages/modules/Virtualization/android/TerminalApp/res/values/config.xml:20 -->
<bool name="gfxstream_supported">false</bool>
<!-- Device overlay sets to true when host GPU supports gfxstream -->
54.16.4 Display Forwarding Pipeline¶
The display pipeline bridges the Linux guest's framebuffer to an Android
SurfaceView:
graph LR
subgraph Guest["Linux Guest"]
MESA["Mesa / virtio-gpu<br/>DRM driver"]
end
subgraph crosvm["crosvm"]
VGPU["virtio-gpu device"]
ADB["Android Display<br/>Backend"]
end
subgraph Android["Android"]
ANW["ANativeWindow"]
SC["SurfaceControl"]
SF["SurfaceFlinger"]
SCREEN["Screen"]
end
MESA -->|"virtio-gpu<br/>commands"| VGPU
VGPU -->|"Render to<br/>surface"| ADB
ADB -->|"Lock buffer<br/>draw pixels<br/>post buffer"| ANW
ANW --> SC
SC --> SF
SF --> SCREEN
ICrosvmAndroidDisplayService AIDL¶
The crosvm GPU backend communicates with Android through a Binder interface:
// Source: packages/modules/Virtualization/libs/android_display_backend/aidl/
// android/crosvm/ICrosvmAndroidDisplayService.aidl
interface ICrosvmAndroidDisplayService {
void setSurface(in Surface surface, boolean forCursor);
void removeSurface(boolean forCursor);
void setCursorStream(in ParcelFileDescriptor stream);
void saveFrameForSurface(boolean forCursor);
void drawSavedFrameForSurface(boolean forCursor);
}
The display backend manages two surfaces — MAIN for the desktop and CURSOR for the mouse pointer:
// Source: packages/modules/Virtualization/android/TerminalApp/java/.../DisplayProvider.kt
// Manages Surface lifecycle for MAIN and CURSOR
// Cursor position streamed via socket (8-byte x,y coordinates per update)
Android Display Backend (C++)¶
The native backend interfaces with Android's graphics stack:
// Source: packages/modules/Virtualization/libs/android_display_backend/
// crosvm_android_display_client.cpp:81
class AndroidDisplaySurface {
// Lock ANativeWindow buffer for GPU rendering
// Post rendered frame via SurfaceControl
// Direct AHardwareBuffer sharing for zero-copy display
// Pixel format: HAL_PIXEL_FORMAT_BGRA_8888
};
54.16.5 Input Forwarding¶
Android input events (touch, keyboard, mouse, trackpad) are forwarded to the Linux guest as evdev events:
Key Code Translation¶
// Source: packages/modules/Virtualization/android/TerminalApp/java/
// .../DisplaySurfaceView.kt:37-110
// 60+ Android key codes mapped to Linux evdev scan codes:
// KEYCODE_A → 0x1E (KEY_A)
// KEYCODE_ENTER → 0x1C (KEY_ENTER)
// KEYCODE_ESCAPE → 0x01 (KEY_ESC)
// KEYCODE_TAB → 0x0F (KEY_TAB)
// Special handling for SHIFT+key combinations
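The translation amounts to a lookup from Android key codes to Linux evdev codes. The sketch below covers only the four keys listed above and is written in Rust for illustration. The TerminalApp's own table lives in Kotlin and is much larger, and android_to_evdev is a hypothetical name. The numeric Android values are the KeyEvent constants, and the evdev values are those from the Linux input event codes header.

```rust
/// Maps an Android KeyEvent key code to a Linux evdev key code.
/// Returns None for keys this small example does not cover.
pub fn android_to_evdev(android_key_code: u32) -> Option<u16> {
    match android_key_code {
        29 => Some(0x1E),  // KEYCODE_A      -> KEY_A
        61 => Some(0x0F),  // KEYCODE_TAB    -> KEY_TAB
        66 => Some(0x1C),  // KEYCODE_ENTER  -> KEY_ENTER
        111 => Some(0x01), // KEYCODE_ESCAPE -> KEY_ESC
        _ => None,
    }
}
```

Returning None for unmapped keys lets the caller drop the event instead of forwarding a wrong code to the guest.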
Input Mode Detection¶
The InputForwarder automatically adapts to the input device:
// Source: packages/modules/Virtualization/android/TerminalApp/java/
// .../InputForwarder.kt:111-137
// Detects physical keyboard → enables mouse pointer capture
// Touch-only → touch events scaled to VM display dimensions
// Trackpad → separate mouse input path
Touch coordinates are scaled from the Android SurfaceView dimensions to the VM's configured display resolution.
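A minimal sketch of that scaling, assuming a simple proportional mapping per axis: scale_touch is a hypothetical function, and the real Kotlin implementation may handle rounding, letterboxing, or rotation differently.

```rust
/// Scales a touch point from view coordinates to VM display coordinates.
/// `view` and `display` are (width, height) in pixels. Returns None if
/// either size has a zero dimension.
pub fn scale_touch(x: u32, y: u32, view: (u32, u32), display: (u32, u32)) -> Option<(u32, u32)> {
    let (view_w, view_h) = view;
    let (disp_w, disp_h) = display;
    if view_w == 0 || view_h == 0 || disp_w == 0 || disp_h == 0 {
        return None;
    }
    // Widen to u64 so the multiplication cannot overflow, then clamp so a
    // touch on the far edge still lands on a valid pixel.
    let scale = |value: u32, from: u32, to: u32| -> u32 {
        let scaled = u64::from(value) * u64::from(to) / u64::from(from);
        scaled.min(u64::from(to - 1)) as u32
    };
    Some((scale(x, view_w, disp_w), scale(y, view_h, disp_h)))
}
```

For a 1080x1920 view driving a 1920x1080 VM display, a touch at the view's center (540, 960) maps to (960, 540).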
54.16.6 Debian VM Configuration¶
Linux VMs are configured via a JSON file that maps to
VirtualMachineCustomImageConfig:
// Source: packages/modules/Virtualization/build/debian/vm_config.json
{
"name": "debian",
"kernel": "$PAYLOAD_DIR/vmlinuz",
"initrd": "$PAYLOAD_DIR/initrd.img",
"disks": [
{ "image": "$PAYLOAD_DIR/root_part", "writable": true, "partitions": [...] }
],
"cpu_topology": "match_host",
"memory_mib": 4096,
"network": true,
"auto_memory_balloon": true,
"gpu": { "backend": "2d" },
"protected": false,
"debuggable": true,
"input": {
"keyboard": true,
"mouse": true,
"multi_touch": true,
"trackpad": true,
"switches": true
}
}
Debian Image Building¶
The build system creates Debian VM images from scratch:
packages/modules/Virtualization/build/debian/
├── build.sh # Main build script
├── build_custom_kernel.sh # Custom kernel build
├── fai/ # FAI (Fully Automatic Installation) configs
│ └── config/ # Debian Bookworm/Trixie profiles
├── localdebs/ # Custom .deb packages
├── ttyd/ # Terminal-over-web support
└── vm_config.json # VM configuration template
Supported architectures: amd64, arm64, ppc64el, riscv64
The resulting image includes a Linux kernel, initrd, and a writable root
partition with Debian userspace. The VM uses cpu_topology: "match_host"
to expose the device's actual CPU topology to the guest.
54.16.7 Feature Flags¶
Linux VM GUI support is gated behind aconfig feature flags:
// Source: packages/modules/Virtualization/build/avf_flags.aconfig:14-18
flag {
name: "terminal_gui_support"
namespace: "virtualization"
description: "Enable GUI display feature in terminal app"
}
// Source: packages/modules/Virtualization/build/avf_flags.aconfig:22-27
flag {
name: "terminal_storage_balloon"
namespace: "virtualization"
description: "Enable storage ballooning for sparse disk support"
}
When terminal_gui_support is disabled, the TerminalApp falls back to a
text-only terminal (ttyd over WebView) instead of the full graphical display.
54.16.8 Virtio GPU Capabilities¶
The crosvm virtio-gpu implementation supports multiple capability sets that determine how the guest GPU driver communicates:
// Source: external/crosvm/devices/src/virtio/gpu/protocol.rs:423
VIRTIO_GPU_CAPSET_CROSS_DOMAIN = 0x5 // Cross-domain buffer sharing
| Capability | Purpose |
|---|---|
| VIRGL | Virgl3D — OpenGL command forwarding |
| GFXSTREAM | Gfxstream — Vulkan/GLES command forwarding |
| CROSS_DOMAIN | Cross-domain buffer sharing (host ↔ guest) |
Feature flags on the virtio-gpu device:
| Feature | Description |
|---|---|
| RESOURCE_BLOB | Blob memory resources (zero-copy buffers) |
| FENCE_PASSING | Synchronization fence forwarding |
| CONTEXT_INIT | Context initialization with capability selection |
| RESOURCE_UUID | UUID-based buffer identification |
The cross-domain capability enables direct sharing of AHardwareBuffers between the Android host and the Linux guest, allowing the guest's display output to appear in Android's SurfaceFlinger composition without extra copies.
54.16.9 Use Cases¶
Desktop Linux on Android Devices¶
The primary use case is running a full Linux desktop environment on Android tablets and foldables. Developers can use familiar Linux tools (VS Code, terminal, compilers) alongside Android apps:
graph LR
subgraph Device["Android Device"]
ANDROID["Android Apps<br/>(Play Store, Settings)"]
LINUX["Linux VM<br/>(Debian Desktop, VS Code,<br/>Terminal, Browser)"]
ANDROID -.->|"Shared network"| LINUX
end
Development Environment¶
Native Linux development tools run on Android hardware without dual-boot or external machines: compilers, IDEs, container runtimes, and databases run inside the isolated VM, and gfxstream GPU acceleration, where that backend is enabled, brings graphics performance close to native.
Secure Isolation¶
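The Linux VM's memory is mapped through Stage-2 page tables (see section 54.4), so a compromised guest cannot access Android's memory. The reverse guarantee, that the host cannot read guest memory, holds only for protected VMs; the Debian configuration shown earlier sets "protected": false. In either mode the VM boundary provides stronger isolation than containers, which share the host kernel.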
54.17 Security Analysis¶
54.17.1 Trust Boundaries¶
AVF defines clear trust boundaries between components:
graph TB
subgraph "Fully Trusted"
HW["Device Hardware"]
ROM["ROM / UDS"]
PKVM["pKVM Hypervisor"]
PVMFW["pvmfw"]
end
subgraph "Partially Trusted (after attestation)"
GUEST_KERNEL["Microdroid Kernel"]
GUEST_OS["Microdroid OS"]
PAYLOAD["VM Payload"]
end
subgraph "Untrusted"
HOST_KERNEL["Host Linux Kernel"]
CROSVM_HOST["crosvm"]
HOST_APPS["Host Applications"]
end
ROM -->|"DICE chain"| PKVM
PKVM -->|"loads & protects"| PVMFW
PVMFW -->|"verifies"| GUEST_KERNEL
GUEST_KERNEL --> GUEST_OS
GUEST_OS --> PAYLOAD
HOST_KERNEL -.->|"cannot access\nguest memory"| GUEST_KERNEL
CROSVM_HOST -.->|"cannot access\nguest secrets"| PVMFW
54.17.2 Attack Surface Analysis¶
Host-to-guest attacks (mitigated by pKVM):
- Direct memory access: Blocked by Stage-2 page tables
- DMA attacks: Blocked by IOMMU and MMIO guard
- Side channels: Partially mitigated by cache/TLB isolation
VMM-to-guest attacks (mitigated by pvmfw):
- Malicious device tree: Sanitized by pvmfw using template FDT
- Fake devices: MMIO guard limits accessible devices
- Rollback attacks: Multiple RBP strategies prevent secret reuse
Guest-to-host attacks (mitigated by crosvm sandboxing):
- Device escape: Process-per-device with seccomp + namespaces
- Virtio attacks: Each device has minimal syscall allowlist
- Resource exhaustion: Memory limits, CPU quotas
54.17.3 Rust Safety Guarantees¶
Both pvmfw and crosvm are written in Rust, providing:
- Memory safety -- No buffer overflows, use-after-free, or double-free
- Thread safety -- Data races prevented at compile time
- No undefined behavior -- Except in explicitly marked unsafe blocks
- Zero-cost abstractions -- Safety without runtime overhead
The pvmfw codebase uses #![no_std] to minimize the trusted computing base,
and unsafe blocks are limited to:
- Hardware register access
- Assembly instructions (HVC calls, memory barriers)
- Raw pointer manipulation for FDT parsing
- Inter-stage memory handoff
54.17.4 DICE Chain Integrity¶
The DICE chain provides cryptographic binding between boot stages. Key derivation follows the Open DICE specification.
Requirements from packages/modules/Virtualization/docs/pvm_dice_chain.md:
- KDF: You must use HKDF-SHA-512, as specified in RFC 5869.
- KDF_ASYM: You must use one of the following supported algorithms:
  - Ed25519
  - ECDSA with NIST P-256 (RFC 6979)
  - ECDSA with NIST P-384 (RFC 6979)
Any mismatch in key derivation between the vendor's bootloader and pvmfw breaks the certificate chain, causing remote attestation, Secretkeeper, and Trusted HAL authentication to fail.
54.18 Performance Considerations¶
54.18.1 Memory Overhead¶
Each VM requires:
- Microdroid base -- ~256 MiB minimum (configurable)
- pvmfw -- ~256 KiB heap + 48 KiB stack
- crosvm overhead -- Per-device process memory
- Page tables -- Stage-2 tables for the guest
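To give the page-table item a rough scale, the sketch below estimates how many table pages are needed to map a guest of a given size. It is an upper-bound estimate under assumptions that are not stated in the source: a 4 KiB translation granule, 8-byte descriptors (512 entries per table), three levels of translation, and no block (huge-page) mappings, which would shrink the figure considerably. The function name is hypothetical.

```rust
const PAGE_SIZE: u64 = 4096;        // assumed 4 KiB granule
const ENTRIES_PER_TABLE: u64 = 512; // 4 KiB table / 8-byte descriptor

/// Number of page-table pages needed to map `guest_bytes` with 4 KiB leaf
/// entries through `levels` levels of translation (no block mappings).
fn stage2_table_pages(guest_bytes: u64, levels: u32) -> u64 {
    // Start with one leaf entry per guest page, rounding up.
    let mut entries = (guest_bytes + PAGE_SIZE - 1) / PAGE_SIZE;
    let mut tables = 0;
    for _ in 0..levels {
        let level_tables = (entries + ENTRIES_PER_TABLE - 1) / ENTRIES_PER_TABLE;
        tables += level_tables;
        // Each table at this level needs one entry in the level above.
        entries = level_tables;
    }
    tables
}

fn main() {
    let guest = 256u64 * 1024 * 1024; // the ~256 MiB Microdroid minimum
    let pages = stage2_table_pages(guest, 3);
    println!("{pages} table pages = {} KiB", pages * PAGE_SIZE / 1024);
}
```

For a 256 MiB guest this gives 128 leaf tables plus one table at each of the two upper levels, so 130 pages or 520 KiB, small next to the guest's own memory.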
54.18.2 Huge Pages¶
AVF supports transparent huge pages (THP) for improved memory performance:
/// Ask the kernel for transparent huge-pages (THP). This is only a hint
/// and the kernel will allocate THP-backed memory only if globally enabled
/// by the system and if any can be found.
#[arg(short, long)]
hugepages: bool,
54.18.3 CPU Topology¶
The --cpu-topology option controls vCPU allocation:
fn parse_cpu_topology(s: &str) -> Result<CpuTopology, String> {
match s {
"one_cpu" => Ok(CpuTopology::CpuCount(1)),
"match_host" => Ok(CpuTopology::MatchHost(true)),
_ if s.starts_with("cpu_count=") => {
let val = s.strip_prefix("cpu_count=").unwrap();
Ok(CpuTopology::CpuCount(val.parse().map_err(|e|
format!("Invalid CPU Count: {e}"))?))
}
_ => Err(format!("Invalid cpu topology {s}")),
}
}
match_host mirrors the host's CPU topology in the guest, which is essential
for performance-sensitive workloads and correct NUMA behavior.
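For experimenting with the accepted values, the following self-contained variant can be compiled on its own. The `CpuTopology` enum here is a local stand-in whose shape is inferred from the excerpt, not the real type, and the parser is restructured slightly around `strip_prefix` while keeping the same accepted strings and error messages.

```rust
/// Local stand-in for the real `CpuTopology` type (shape assumed from the excerpt).
#[derive(Debug, PartialEq)]
enum CpuTopology {
    CpuCount(i32),
    MatchHost(bool),
}

/// Parses the value of `--cpu-topology`.
fn parse_cpu_topology(s: &str) -> Result<CpuTopology, String> {
    match s {
        "one_cpu" => Ok(CpuTopology::CpuCount(1)),
        "match_host" => Ok(CpuTopology::MatchHost(true)),
        // `strip_prefix` both tests for and removes the prefix in one step.
        _ => match s.strip_prefix("cpu_count=") {
            Some(val) => val
                .parse::<i32>()
                .map(CpuTopology::CpuCount)
                .map_err(|e| format!("Invalid CPU Count: {e}")),
            None => Err(format!("Invalid cpu topology {s}")),
        },
    }
}

fn main() {
    for arg in ["one_cpu", "match_host", "cpu_count=4", "cpu_count=many", "all"] {
        println!("{arg:>16} -> {:?}", parse_cpu_topology(arg));
    }
}
```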
54.18.4 I/O Performance Tuning¶
Microdroid applies several I/O optimizations in init.rc:
# Disable proactive compaction
write /proc/sys/vm/compaction_proactiveness 0
# Disable dm-verity prefetch (reduces I/O)
write /sys/module/dm_verity/parameters/prefetch_cluster 0
# Maximize swappiness
write /proc/sys/vm/swappiness 100
# Increase watermark scale factor for memory reclaim
write /proc/sys/vm/watermark_scale_factor 600
54.19 Vsock Communication¶
54.19.1 Overview¶
AVF uses vsock (Virtual Machine Sockets) for communication between the host and guest VMs. Vsock provides a socket interface similar to TCP/UDP but operates over a virtual transport that does not require network configuration.
54.19.2 CID Assignment¶
Each VM receives a unique CID (Context ID) for vsock addressing. The VirtualizationService manages CID allocation:
const GUEST_CID_MIN: Cid = 2048;
const GUEST_CID_MAX: Cid = 65535;
const SYSPROP_LAST_CID: &str = "virtualizationservice.state.last_cid";
Special CID values:
- VMADDR_CID_HYPERVISOR (0) -- The hypervisor
- VMADDR_CID_LOCAL (1) -- Local loopback
- VMADDR_CID_HOST (2) -- The host
- 2048-65535 -- Guest VMs managed by VirtualizationService
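One plausible reading of these constants is a wrap-around allocator that continues from the last CID handed out and skips any that are still in use. The sketch below illustrates that idea only; the `next_cid` function and the `in_use` set are hypothetical, and the actual VirtualizationService logic may differ.

```rust
use std::collections::HashSet;

type Cid = u32;
const GUEST_CID_MIN: Cid = 2048;
const GUEST_CID_MAX: Cid = 65535;

/// Returns the next free guest CID after `last`, wrapping around at
/// GUEST_CID_MAX. `in_use` holds the CIDs of VMs that are still alive.
/// Returns `None` when every CID in the guest range is taken.
fn next_cid(last: Option<Cid>, in_use: &HashSet<Cid>) -> Option<Cid> {
    // Continue after the last allocation; restart at MIN when there is no
    // previous value, it is out of range, or it was the maximum.
    let start = match last {
        Some(cid) if (GUEST_CID_MIN..GUEST_CID_MAX).contains(&cid) => cid + 1,
        _ => GUEST_CID_MIN,
    };
    // Scan [start, MAX] first, then wrap around to [MIN, start).
    (start..=GUEST_CID_MAX)
        .chain(GUEST_CID_MIN..start)
        .find(|cid| !in_use.contains(cid))
}

fn main() {
    let in_use: HashSet<Cid> = vec![2048, 2049].into_iter().collect();
    println!("{:?}", next_cid(None, &in_use));
    println!("{:?}", next_cid(Some(GUEST_CID_MAX), &HashSet::new()));
}
```

Continuing from the last value rather than always restarting at the minimum avoids immediately reusing the CID of a VM that has just exited, which is one reason a service might persist the last CID in a system property.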
54.19.3 Communication Channels¶
AVF uses vsock for several internal communication channels:
graph LR
subgraph "Guest VM"
MM["microdroid_manager"]
PAYLOAD["VM Payload"]
ADBD["adbd"]
end
subgraph "Host"
VS["VirtualizationService"]
VIRTMGR["virtmgr"]
ADB["adb"]
end
MM <-->|"vsock: lifecycle\ncallbacks"| VIRTMGR
PAYLOAD <-->|"vsock: Binder RPC"| VS
ADBD <-->|"vsock: 5555"| ADB
MM <-->|"vsock: tombstones"| VS
54.19.4 Binder Over Vsock¶
The VM Payload API allows hosting Binder RPC servers over vsock:
// Host a Binder server in the VM, accessible from the host
void AVmPayload_runVsockRpcServer(
AIBinder* service,
unsigned int port,
AVmPayload_VsockRpcServerCallback onReady,
void* param);
This enables structured RPC communication between the host app and VM payload without requiring a network stack.
54.20 Encrypted Storage¶
54.20.1 Architecture¶
Microdroid provides encrypted persistent storage for VMs that need to retain data across reboots. The storage is backed by a host-side file but encrypted with keys derived from the VM's DICE chain.
graph TB
subgraph "Host"
FILE["Encrypted store file\n(/data/...)"]
end
subgraph "crosvm"
VIRTIO_BLK["virtio-blk\n(encrypted store disk)"]
end
subgraph "Microdroid"
DM_CRYPT["dm-crypt"]
MOUNT["/mnt/encryptedstore"]
MM["microdroid_manager"]
end
FILE --> VIRTIO_BLK
VIRTIO_BLK --> DM_CRYPT
DM_CRYPT --> MOUNT
MM -->|"derive key\nfrom DICE CDI_Seal"| DM_CRYPT
54.20.2 Key Derivation¶
The encryption key is derived from the VM's CDI_Seal value, which is part of
the DICE chain. This ensures that:
- Only the same VM (same code, same configuration) can decrypt the data
- A different VM instance cannot access another instance's data
- A rolled-back VM version cannot access data from a newer version
- The host cannot decrypt the data (it never sees the key)
54.20.3 Storage Lifecycle¶
sequenceDiagram
participant App as Host App
participant VS as VirtualizationService
participant CV as crosvm
participant MM as microdroid_manager
participant FS as Encrypted Store
App->>VS: Create VM with encryptedStorageImage
VS->>CV: Pass storage file as virtio-blk disk
CV->>MM: VM boots, disk available
MM->>MM: Derive encryption key from CDI_Seal
MM->>FS: Setup dm-crypt on virtio-blk device
MM->>FS: Mount at /mnt/encryptedstore
MM->>MM: Set microdroid_manager.encrypted_store.status=mounted
Note over MM,FS: init.rc restorecon and tuning
MM->>MM: Set microdroid_manager.encrypted_store.status=ready
Note over MM,FS: Payload can now use /mnt/encryptedstore
54.20.4 Storage Size Management¶
Storage can be pre-allocated or resized:
let storage = if let Some(ref path) = config.storage {
if !path.exists() {
command_create_partition(
service,
path,
config.microdroid.storage_size.unwrap_or(10 * 1024 * 1024),
PartitionType::ENCRYPTEDSTORE,
)?;
} else if let Some(storage_size) = config.microdroid.storage_size {
set_encrypted_storage(service, path, storage_size)?;
}
Some(open_parcel_file(path, true)?)
} else {
None
};
Default size is 10 MiB, configurable via --storage-size.
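The branching in that excerpt reduces to a small decision table, sketched here as a pure function so that each case is easy to see. The `StorageAction` enum and `plan_storage` function are hypothetical names for illustration; labelling the `set_encrypted_storage` branch as a resize follows the sentence introducing the excerpt.

```rust
/// Default encrypted store size used when `--storage-size` is not given.
const DEFAULT_STORAGE_SIZE: u64 = 10 * 1024 * 1024; // 10 MiB

/// What the CLI would do with the encrypted store (illustrative only).
#[derive(Debug, PartialEq)]
enum StorageAction {
    /// No storage path configured: the VM gets no encrypted store.
    Skip,
    /// The file does not exist yet: create a partition of this size.
    Create { size: u64 },
    /// The file exists and a size was requested: resize it.
    Resize { size: u64 },
    /// The file exists and no size was requested: use it unchanged.
    Reuse,
}

fn plan_storage(path_given: bool, exists: bool, requested: Option<u64>) -> StorageAction {
    match (path_given, exists, requested) {
        (false, _, _) => StorageAction::Skip,
        (true, false, size) => StorageAction::Create {
            size: size.unwrap_or(DEFAULT_STORAGE_SIZE),
        },
        (true, true, Some(size)) => StorageAction::Resize { size },
        (true, true, None) => StorageAction::Reuse,
    }
}

fn main() {
    println!("{:?}", plan_storage(true, false, None));
    println!("{:?}", plan_storage(true, true, Some(64 * 1024 * 1024)));
    println!("{:?}", plan_storage(true, true, None));
    println!("{:?}", plan_storage(false, false, None));
}
```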
54.21 Updatable VMs and Secretkeeper¶
54.21.1 The Update Problem¶
When a VM's code is updated, the DICE chain changes because the code measurements are different. This means the CDI values change, and any data encrypted with the old CDI cannot be decrypted by the new version.
54.21.2 Secretkeeper Protocol¶
Secretkeeper solves this by providing a secure key-value store that persists across VM updates. The VM stores its secrets in Secretkeeper rather than encrypting them directly with DICE-derived keys.
sequenceDiagram
participant VM_v1 as VM (version 1)
participant SK as Secretkeeper HAL
participant VM_v2 as VM (version 2)
Note over VM_v1,SK: Initial provisioning
VM_v1->>SK: Store secret (key=vm_id, value=data_key)
SK->>SK: Verify VM identity via DICE chain
SK->>SK: Store encrypted with platform key
Note over VM_v2,SK: After update
VM_v2->>SK: Retrieve secret (key=vm_id)
SK->>SK: Verify VM identity (new DICE chain)
SK->>SK: Check rollback protection
SK-->>VM_v2: Return data_key
VM_v2->>VM_v2: Decrypt persistent data with data_key
The pvmfw integration handles Secretkeeper-capable VMs:
if verified_boot_data.has_capability(Capability::SecretkeeperProtection) {
perform_deferred_rollback_protection(verified_boot_data)?;
Ok((false, instance_hash.unwrap(), true))
}
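To make the update and rollback cases in the sequence diagram concrete, here is a deliberately simplified toy model. It is not the Secretkeeper protocol: there is no DICE chain verification or encryption, the VM's identity is reduced to a string ID plus an integer version, and every type and method name is hypothetical. It shows only the policy idea that a newer version may read a secret stored by an older one, while an older version is refused once a newer one has stored.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum StoreError {
    NotFound,
    Rollback { caller: u32, required: u32 },
}

struct Entry {
    secret: Vec<u8>,
    min_version: u32, // lowest VM version still allowed to access the secret
}

#[derive(Default)]
struct ToySecretStore {
    entries: HashMap<String, Entry>,
}

impl ToySecretStore {
    /// Looks up the entry and rejects callers older than the pinned version.
    fn check(&self, vm_id: &str, version: u32) -> Result<&Entry, StoreError> {
        let entry = self.entries.get(vm_id).ok_or(StoreError::NotFound)?;
        if version < entry.min_version {
            return Err(StoreError::Rollback { caller: version, required: entry.min_version });
        }
        Ok(entry)
    }

    /// Stores a secret and pins the minimum version to the caller's version.
    fn store(&mut self, vm_id: &str, version: u32, secret: &[u8]) -> Result<(), StoreError> {
        match self.check(vm_id, version) {
            Ok(_) | Err(StoreError::NotFound) => {} // allowed, or first write
            Err(e) => return Err(e),                // a rolled-back VM may not overwrite
        }
        let entry = Entry { secret: secret.to_vec(), min_version: version };
        self.entries.insert(vm_id.to_string(), entry);
        Ok(())
    }

    /// Returns the secret if the caller's version passes the rollback check.
    fn get(&self, vm_id: &str, version: u32) -> Result<Vec<u8>, StoreError> {
        Ok(self.check(vm_id, version)?.secret.clone())
    }
}

fn main() {
    let mut store = ToySecretStore::default();
    store.store("vm-a", 1, b"data_key").unwrap();
    println!("v2 reads v1's secret: {:?}", store.get("vm-a", 2));
    store.store("vm-a", 2, b"data_key").unwrap();
    println!("v1 after v2 stored:   {:?}", store.get("vm-a", 1));
}
```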
54.21.3 VM Reference DT for Secretkeeper¶
The VM reference DT (pvmfw config version 1.2) provides a mechanism to securely pass the Secretkeeper public key to VMs.
Use-cases of VM reference DT include:
- Passing the public key of the Secretkeeper HAL implementation to each VM.
- Passing the vendor hashtree digest to run Microdroid with verified vendor image.
The bootloader adds the Secretkeeper public key to the host device tree under
/avf/reference/, and pvmfw validates that if the same property appears in the
VM's device tree, its value matches the reference.
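The matching rule can be sketched with the two device trees reduced to flat property maps. This is an illustration of the stated rule only, not pvmfw's FDT code: the `Props` alias, the function name, and the property name used in `main` are assumptions, and how pvmfw treats properties that have no reference entry is outside what this sketch models.

```rust
use std::collections::BTreeMap;

/// A device tree node reduced to a flat map of property name -> raw bytes.
type Props = BTreeMap<String, Vec<u8>>;

/// Checks the rule described above: a reference property that also appears
/// in the VM's device tree must carry exactly the same value there.
/// Reference properties absent from the VM's tree are not an error.
fn validate_against_reference(reference: &Props, vm_dt: &Props) -> Result<(), String> {
    for (name, expected) in reference {
        if let Some(actual) = vm_dt.get(name) {
            if actual != expected {
                return Err(format!("property {name} does not match the reference"));
            }
        }
    }
    Ok(())
}

fn main() {
    // Illustrative property name and values; not taken from a real device tree.
    let mut reference = Props::new();
    reference.insert("secretkeeper_public_key".to_string(), vec![0xAA, 0xBB]);

    let mut vm_dt = Props::new();
    println!("absent:   {:?}", validate_against_reference(&reference, &vm_dt));

    vm_dt.insert("secretkeeper_public_key".to_string(), vec![0xAA, 0xBB]);
    println!("matching: {:?}", validate_against_reference(&reference, &vm_dt));

    vm_dt.insert("secretkeeper_public_key".to_string(), vec![0x00]);
    println!("tampered: {:?}", validate_against_reference(&reference, &vm_dt));
}
```

Because the reference values come from the bootloader rather than from the VMM, a mismatch signals that the VM's device tree was altered on the untrusted host side.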
54.22 Early VM (Boot-Time VMs)¶
54.22.1 Concept¶
AVF supports early VMs that start during device boot, before the full Android
userspace is available. These are documented in
packages/modules/Virtualization/docs/early_vm.md.
Early VMs are used for:
- Security-critical services that must be available from first boot
- TEE services that need to start before Android init completes
- Hardware initialization that requires a trusted execution environment
54.22.2 Boot Sequence Integration¶
graph TB
ABL["Android Bootloader"] --> KERNEL["Linux Kernel Boot"]
KERNEL --> PKVM["pKVM Initialization"]
PKVM --> EARLY_VM["Early VM Start"]
EARLY_VM --> INIT["Android init"]
INIT --> VS["VirtualizationService"]
VS --> REGULAR_VM["Regular VM Start"]
54.23 Debugging Deep Dive¶
54.23.1 Debug Policy¶
The debug policy controls what debugging features are available for protected VMs. It is passed as a DTBO in the pvmfw configuration data (entry 1).
The debug policy is only applied when the DICE chain indicates debug mode:
// The bootloader should never pass us a debug policy when the boot is secure
if debug_policy.is_some() && !dice_debug_mode {
warn!("Ignoring debug policy, DICE handover does not indicate Debug mode");
debug_policy = None;
}
54.23.2 Debug Levels¶
The vm CLI supports two debug levels:
fn parse_debug_level(s: &str) -> Result<DebugLevel, String> {
match s {
"none" => Ok(DebugLevel::NONE),
"full" => Ok(DebugLevel::FULL),
_ => Err(format!("Invalid debug level {s}")),
}
}
- none -- Production mode. No console output, no logging, no ADB.
- full -- Debug mode. Console output, logging, ADB access in Microdroid.
54.23.3 Early Console (earlycon)¶
For debugging early boot issues, earlycon can be enabled to get kernel output before the normal console driver initializes:
if config.debug.enable_earlycon() {
if cfg!(target_arch = "aarch64") {
custom_config.extraKernelCmdlineParams
.push(String::from("earlycon=uart8250,mmio,0x3f8"));
} else if cfg!(target_arch = "x86_64") {
custom_config.extraKernelCmdlineParams
.push(String::from("earlycon=uart8250,io,0x3f8"));
}
custom_config.extraKernelCmdlineParams
.push(String::from("keep_bootcon"));
}
For protected VMs, pvmfw controls UART access. Debuggable payloads keep UART mapped after pvmfw hands off:
// Keep UART MMIO_GUARD-ed for debuggable payloads, to enable earlycon.
let keep_uart = cfg!(debuggable_vms_improvements) && debuggable_payload;
54.23.4 GDB Debugging¶
crosvm supports GDB remote debugging of the guest kernel:
/// Port at which crosvm will start a gdb server to debug guest kernel.
/// Note: this is only supported on Android kernels android14-5.15 and higher.
#[arg(long)]
gdb: Option<NonZeroU16>,
Usage:
# Start VM with GDB server
adb shell /apex/com.android.virt/bin/vm run-microdroid \
--debug full --gdb 1234
# Forward the port
adb forward tcp:1234 tcp:1234
# Connect with GDB
gdb-multiarch vmlinux -ex "target remote :1234"
54.23.5 Device Tree Dump¶
The --dump-device-tree option captures the VM's device tree for inspection. This is useful for debugging device assignment issues or verifying the sanitized FDT that pvmfw produces.
54.23.6 VM Callback Debugging¶
The vm CLI implements callbacks that print VM lifecycle events:
struct Callback {}
impl vmclient::VmCallback for Callback {
fn on_payload_started(&self, _cid: i32) {
eprintln!("payload started");
}
fn on_payload_ready(&self, _cid: i32) {
eprintln!("payload is ready");
}
fn on_payload_finished(&self, _cid: i32, exit_code: i32) {
eprintln!("payload finished with exit code {exit_code}");
}
fn on_error(&self, _cid: i32, error_code: ErrorCode, message: &str) {
eprintln!("VM encountered an error: code={error_code:?}, message={message}");
}
}
54.24 Testing Infrastructure¶
54.24.1 Test Suites¶
AVF includes several test suites:
| Test Suite | Purpose |
|---|---|
| MicrodroidHostTestCases | Host-side integration tests |
| MicrodroidTestApp | In-VM test application |
| MicrodroidTests | DICE chain validation, boot verification |
| pvmfw unit tests | Firmware-level unit tests |
| crosvm e2e tests | End-to-end VM tests |
| VTS tests | Vendor test suite for HAL compliance |
54.24.2 DICE Chain Validation Test¶
The protectedVmHasValidDiceChain test verifies:
- All DICE chain fields conform to the Android Profile for DICE
- The chain is a valid certificate chain where each certificate's subject public key verifies the next certificate's signature
From packages/modules/Virtualization/docs/pvm_dice_chain.md:
The test retrieves the DICE chain from within a Microdroid VM in protected mode and checks the following properties using the hwtrust library.
54.24.3 Running Specific Tests¶
# Run all Microdroid host tests
atest MicrodroidHostTestCases
# Run specific DICE chain test
atest MicrodroidTests#protectedVmHasValidDiceChain
# Run with verbose output
atest MicrodroidHostTestCases -v
# Run VTS tests for capabilities HAL
atest VtsHalVirtualizationCapabilitiesTargetTest
54.24.4 Test VM Configuration¶
Tests use the EmptyPayloadApp as a baseline VM payload:
fn find_empty_payload_apk_path() -> Result<PathBuf, Error> {
const GLOB_PATTERN: &str =
"/apex/com.android.virt/app/**/EmptyPayloadApp*.apk";
let mut entries: Vec<PathBuf> = glob(GLOB_PATTERN)
.context("failed to glob")?
.filter_map(|e| e.ok())
.collect();
match entries.pop() {
Some(path) => Ok(path),
None => Err(anyhow!("No apks match {}", GLOB_PATTERN)),
}
}
54.25 Build System Integration¶
54.25.1 APEX Build¶
The com.android.virt APEX is built using the banchan build target.
54.25.2 Microdroid Image Build¶
The Microdroid system image is built as part of the APEX. The build configuration
files are at packages/modules/Virtualization/build/microdroid/:
- microdroid.json -- VM configuration template
- init.rc -- Init process configuration
- fstab.microdroid -- Filesystem mount table
- build.prop -- System properties
- cgroups.json -- Cgroup configuration
- bootconfig.* -- Architecture-specific boot configs
- microdroid_manifest.xml -- Android manifest
- microdroid_group / microdroid_passwd -- User/group definitions
54.25.3 pvmfw Build¶
pvmfw is built as a bare-metal binary using the vmbase infrastructure:
packages/modules/Virtualization/guest/pvmfw/
Android.bp # Build rules
src/ # Rust source code
platform_arm64.dts # ARM64 device tree source
platform_x86_64.dts # x86_64 device tree source
avb/ # AVB verification keys
testdata/ # Test data
The build produces pvmfw.bin, which is included in the APEX and optionally
written to a dedicated pvmfw partition on the device.
54.25.4 Product Configuration¶
AVF is enabled for a product through the product makefile. Devices with protected VM support may need additional configuration.
54.26 Feature Flags and Conditional Compilation¶
54.26.1 Cargo Feature Flags in pvmfw¶
pvmfw uses Rust cfg attributes to conditionally compile features based on the
target platform:
// instance.img-based rollback protection
} else if cfg!(feature = "instance-img") {
perform_legacy_rollback_protection(fdt, dice_inputs, cdi_seal, instance_hash)
}
// Legacy raw DICE handover compatibility (Android T)
Err(config::Error::InvalidMagic) if cfg!(feature = "compat-raw-dice-handover") => {
warn!("Assuming the appended data to be a raw DICE handover");
Some(Self::LegacyDiceHandover(&mut data[..DICE_CHAIN_SIZE]))
}
// Debuggable VM improvements
let keep_uart = cfg!(debuggable_vms_improvements) && debuggable_payload;
// DICE chain changes
let bytes_for_next = if cfg!(dice_changes) {
Cow::Borrowed(bytes)
} else {
Cow::Owned(truncated_bytes)
};
54.26.2 Build-Time Feature Flags in the vm CLI¶
The vm CLI uses cfg blocks to gate features that may not be available on
all platforms:
// Network support
#[cfg(network)]
#[arg(short, long)]
network_supported: bool,
// Vendor modules
#[cfg(vendor_modules)]
#[arg(long)]
vendor: Option<PathBuf>,
// Device assignment
#[cfg(device_assignment)]
#[arg(long)]
devices: Vec<PathBuf>,
// TEE services allowlist
#[cfg(tee_services_allowlist)]
#[arg(long)]
tee_services: Vec<String>,
// Debuggable VM improvements
#[cfg(debuggable_vms_improvements)]
#[arg(long)]
enable_earlycon: bool,
// VM-to-host services
#[cfg(vm_to_host_services)]
#[arg(long)]
host_services: Vec<String>,
Each feature flag is accompanied by a runtime accessor that returns a default value when the feature is not compiled in:
impl CommonConfig {
fn network_supported(&self) -> bool {
cfg_if::cfg_if! {
if #[cfg(network)] {
self.network_supported
} else {
false
}
}
}
}
54.26.3 VirtualizationService Feature Flags¶
The VirtualizationService uses cfg for the LLPVM (Long-Lived Protected VM)
maintenance service:
if cfg!(llpvm_changes) {
let maintenance_service =
BnVirtualizationMaintenance::new_binder(
service.clone(), BinderFeatures::default()
);
register(MAINTENANCE_SERVICE_NAME, maintenance_service)?;
}
54.26.4 crosvm Feature Flags¶
crosvm uses Cargo features extensively to control optional components:
#[cfg(feature = "composite-disk")]
use disk::create_composite_disk;
#[cfg(feature = "qcow")]
use disk::QcowFile;
#[cfg(feature = "gpu")]
use devices::virtio::vhost::user::device::run_gpu_device;
#[cfg(feature = "net")]
use devices::virtio::vhost::user::device::run_net_device;
#[cfg(feature = "audio")]
use devices::virtio::vhost::user::device::run_snd_device;
#[cfg(feature = "balloon")]
use vm_control::BalloonControlCommand;
#[cfg(feature = "pci-hotplug")]
use vm_control::client::do_net_add;
#[cfg(feature = "scudo")]
#[global_allocator]
static ALLOCATOR: scudo::GlobalScudoAllocator = scudo::GlobalScudoAllocator;
For Android builds, the scudo allocator is enabled for hardened memory
allocation, and GPU/audio features are typically disabled since Microdroid
VMs are headless.
54.27 Comparison with Other Virtualization Solutions¶
54.27.1 AVF vs Traditional Hypervisors¶
| Aspect | AVF/pKVM | Type-1 Hypervisor (e.g., Xen) | Type-2 (e.g., QEMU/KVM) |
|---|---|---|---|
| TCB size | Minimal (pKVM at EL2) | Large (full hypervisor) | Very large (host OS + QEMU) |
| Host trust | Untrusted (for pVMs) | Partially trusted | Fully trusted |
| Memory isolation | Stage-2 enforced | Stage-2 enforced | Stage-2 enforced |
| DICE attestation | Built-in | Not standard | Not standard |
| Device model | crosvm (Rust, sandboxed) | Various | QEMU (C, monolithic) |
| Guest OS | Microdroid (minimal Android) | Any | Any |
| Primary use case | Confidential mobile compute | Server virtualization | Desktop/server VMs |
54.27.2 AVF vs ARM CCA¶
ARM Confidential Compute Architecture (CCA) introduces Realms as a hardware feature for confidential computing. pKVM is designed to be compatible with CCA where available:
graph TB
subgraph "Current (pKVM)"
EL2_PKVM["EL2: pKVM Hypervisor"]
NS_HOST["Non-Secure: Host"]
NS_GUEST["Non-Secure: Protected VM"]
end
subgraph "Future (ARM CCA)"
EL2_RMM["EL2: Realm Management Monitor"]
NS_HOST2["Non-Secure: Host"]
REALM["Realm: Protected VM"]
end
The pvmfw README acknowledges this forward compatibility:
The pVM concept is not Google-exclusive. Partner-defined VMs (SoC/OEM) meeting isolation/memory access restrictions are also pVMs.
Summary¶
The Android Virtualization Framework represents a fundamental shift in Android's security architecture, bringing hardware-backed confidential computing to mobile devices. The key components work together to create a complete virtualization ecosystem:
- pKVM at EL2 provides the foundational memory isolation guarantee
- pvmfw establishes the root of trust within each protected VM
- crosvm manages the virtual machine with per-device sandboxing
- Microdroid provides a minimal Android runtime for VM payloads
- VirtualizationService orchestrates the entire lifecycle from userspace
- DICE attestation provides a cryptographic chain of trust from ROM to payload
The framework is designed with defense in depth: even if the host kernel is compromised, a protected VM's secrets remain safe. The Rust implementation of both crosvm and pvmfw provides memory safety guarantees in the most security-critical components.
Key Source Paths¶
| Component | Path |
|---|---|
| AVF Module | packages/modules/Virtualization/ |
| VirtualizationService | packages/modules/Virtualization/android/virtualizationservice/ |
| virtmgr | packages/modules/Virtualization/android/virtmgr/ |
| vm CLI | packages/modules/Virtualization/android/vm/ |
| composd | packages/modules/Virtualization/android/composd/ |
| pvmfw | packages/modules/Virtualization/guest/pvmfw/ |
| Service VM | packages/modules/Virtualization/guest/service_vm/ |
| Microdroid build | packages/modules/Virtualization/build/microdroid/ |
| VM Payload API | packages/modules/Virtualization/libs/libvm_payload/ |
| Java API | packages/modules/Virtualization/libs/framework-virtualization/ |
| crosvm | external/crosvm/ |
| VM Capabilities HAL | hardware/interfaces/virtualization/capabilities_service/ |
| DICE chain docs | packages/modules/Virtualization/docs/pvm_dice_chain.md |
| Remote attestation docs | packages/modules/Virtualization/docs/vm_remote_attestation.md |
| Shutdown docs | packages/modules/Virtualization/docs/shutdown.md |
| Device assignment docs | packages/modules/Virtualization/docs/device_assignment.md |