Chapter 7: Bionic and the Dynamic Linker
Android does not use the GNU C Library (glibc). Instead, it relies on Bionic, a custom C library designed from the ground up for mobile devices. This chapter performs a deep, source-level walkthrough of Bionic's architecture, its system call interface, the dynamic linker that loads every native binary on Android, and the VNDK namespace isolation that enforces the Treble architecture boundary at the library-loading level.
Every native process on Android -- from the init daemon that boots the system to
the app you launched a moment ago -- passes through the code examined here. The
source files live under bionic/ in the AOSP tree, with supporting
infrastructure in system/linkerconfig/ and build/soong/cc/.
7.1 Bionic: Android's C Library
7.1.1 Why Not glibc?
The choice to create a new C library rather than adopt glibc was one of the earliest and most consequential decisions in Android's history. The reasons are both legal and technical:
- Licensing. glibc is licensed under the LGPL. While the LGPL permits dynamic linking without imposing copyleft obligations on the calling code, the Android team wanted to avoid any ambiguity for device manufacturers and app developers. Bionic is licensed under the three-clause BSD license, which imposes essentially no restrictions on downstream use.
- Size. glibc is designed for general-purpose Linux systems. It supports dozens of locales, extensive internationalization machinery, NSS (Name Service Switch) modules, and a rich set of GNU extensions. On a mobile device with constrained flash storage and RAM, this overhead is unwelcome. Bionic strips away everything that Android does not need.
- Startup speed. Every Android application starts as a fork of the Zygote process, and many native daemons launch during boot. The time to perform dynamic linking and C library initialization is multiplied by hundreds of processes. Bionic is designed for fast startup: its dynamic linker is lean, its initialization path is short, and its thread-local storage (TLS) layout is fixed at compile time rather than computed at runtime.
- Android-specific features. Bionic integrates directly with Android's property system, its logging infrastructure (liblog), its security model (seccomp-BPF filters applied at Zygote fork), and its memory allocator (Scudo). These integrations would require extensive patching of glibc.
- Thread model. Bionic's pthread implementation is tightly coupled to the Linux kernel's threading primitives (clone, futex, robust mutexes) and omits features like POSIX thread cancellation that Android does not use.
7.1.2 Source Tree Layout
The Bionic C library source lives at:
The directory contains 38 top-level entries. The most important are:
| Directory | Purpose |
|---|---|
| bionic/ | Core C library implementations (261 .cpp files) |
| arch-arm/ | ARM 32-bit assembly and architecture-specific code |
| arch-arm64/ | AArch64 assembly, IFUNC resolvers, Oryon optimizations |
| arch-x86/ | x86 32-bit code |
| arch-x86_64/ | x86-64 code |
| arch-riscv64/ | RISC-V 64-bit code |
| arch-common/ | Architecture-independent assembly helpers |
| include/ | Public C library headers exposed to the NDK |
| kernel/ | Sanitized Linux kernel headers |
| private/ | Internal headers shared between libc and the linker |
| seccomp/ | Seccomp-BPF policy generation and installation |
| stdio/ | Standard I/O implementation |
| dns/ | DNS resolver (a stripped-down NetBSD resolver) |
| upstream-freebsd/ | Code imported from FreeBSD |
| upstream-netbsd/ | Code imported from NetBSD |
| upstream-openbsd/ | Code imported from OpenBSD |
| async_safe/ | Async-signal-safe logging and formatting |
| system_properties/ | Android property system client |
| tools/ | Code generation scripts (gensyscalls.py, genseccomp.py) |
| tzcode/ | Timezone handling (from IANA tz database) |
| platform/ | Platform-specific headers |
| memory/ | Memory tagging support (MTE) |
7.1.3 Core Library: bionic/libc/bionic/
The bionic/libc/bionic/ directory is the heart of the C library. It contains
261 source files implementing everything from malloc() to pthread_create().
Key files include:
Process initialization:
- libc_init_common.cpp -- Common initialization for static and dynamic executables
- libc_init_dynamic.cpp -- Initialization path for dynamically-linked executables
- libc_init_static.cpp -- Initialization path for statically-linked executables
Threading:
- pthread_create.cpp -- Thread creation
- pthread_mutex.cpp -- Mutex implementation (uses Linux futexes)
- pthread_cond.cpp -- Condition variables
- pthread_rwlock.cpp -- Reader-writer locks
- pthread_internal.h -- Internal thread state structures
Memory allocation:
- malloc_common.cpp -- Dispatch layer for the allocator
From bionic/libc/bionic/malloc_common.cpp (lines 67-77):
extern "C" void* calloc(size_t n_elements, size_t elem_size) {
  auto dispatch_table = GetDispatchTable();
  if (__predict_false(dispatch_table != nullptr)) {
    return MaybeTagPointer(dispatch_table->calloc(n_elements, elem_size));
  }
  void* result = Malloc(calloc)(n_elements, elem_size);
  if (__predict_false(result == nullptr)) {
    warning_log("calloc(%zu, %zu) failed: returning null pointer", n_elements, elem_size);
  }
  return MaybeTagPointer(result);
}
This dispatch pattern is fundamental to Bionic's memory allocation architecture.
The GetDispatchTable() call checks whether a debug malloc or profiling malloc
has been installed. If so, the call is redirected. Otherwise, it falls through
to Scudo (the default allocator) via the Malloc() macro. The
MaybeTagPointer() call implements MTE (Memory Tagging Extension) pointer
tagging on hardware that supports it.
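The dispatch pattern can be sketched in plain C. This is a minimal illustration, not Bionic's code: `alloc_dispatch`, `my_calloc`, `install_counting_hooks`, and `hook_calls` are hypothetical names, and the standard `calloc` stands in for Scudo.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical dispatch table: one function pointer per allocator entry point. */
typedef struct {
    void *(*calloc_fn)(size_t, size_t);
} alloc_dispatch;

/* NULL means no debug or profiling allocator is installed. */
static const alloc_dispatch *g_dispatch = NULL;
static size_t g_hook_calls = 0;

/* Stand-in for a debug allocator: counts calls, then defers to the default. */
static void *counting_calloc(size_t n, size_t size) {
    g_hook_calls++;
    return calloc(n, size);
}

static const alloc_dispatch g_counting_dispatch = { counting_calloc };

void install_counting_hooks(void) { g_dispatch = &g_counting_dispatch; }
size_t hook_calls(void) { return g_hook_calls; }

/* Entry point: redirect if a table is installed, else use the default path. */
void *my_calloc(size_t n, size_t size) {
    const alloc_dispatch *table = g_dispatch;
    if (table != NULL) {
        return table->calloc_fn(n, size);
    }
    return calloc(n, size); /* default allocator (Scudo in Bionic's case) */
}
```

Callers always use `my_calloc`; installing or removing a table changes where the call lands without any change at the call site.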
System call wrappers:
- clone.cpp, exec.cpp, fork.cpp -- Process management
- socket.cpp, accept.cpp -- Network I/O
String and memory operations:
- Architecture-optimized via IFUNC (Indirect Function) dispatch
Dynamic library support:
- dl_iterate_phdr_static.cpp -- dl_iterate_phdr for static executables
- dlfcn.cpp -- dlopen/dlsym/dlclose wrappers
7.1.4 Process Initialization
When a dynamically-linked executable starts, the kernel maps the executable and
the dynamic linker (see Section 7.3). The linker performs relocation, then
calls libc's .preinit_array entry __libc_preinit. This function, defined in
bionic/libc/bionic/libc_init_dynamic.cpp, runs before any other shared
library initializer:
From bionic/libc/bionic/libc_init_dynamic.cpp (lines 29-42):
/*
* This source files provides two important functions for dynamic
* executables:
*
* - a C runtime initializer (__libc_preinit), which is called by
* the dynamic linker when libc.so is loaded. This happens before
* any other initializer (e.g. static C++ constructors in other
* shared libraries the program depends on).
*
* - a program launch function (__libc_init), which is called after
* all dynamic linking has been performed.
*/
The initialization sequence is:
sequenceDiagram
participant Kernel
participant Linker as Dynamic Linker
participant LibC as libc.so
participant App as Application
Kernel->>Linker: Map ELF, transfer control
Linker->>Linker: Self-relocate
Linker->>Linker: Load dependencies (BFS)
Linker->>Linker: Relocate all libraries
Linker->>LibC: Call __libc_preinit()
LibC->>LibC: Init TLS, stack guard, properties
Linker->>Linker: Call .init_array for all libs
Linker->>App: Jump to entry point
App->>LibC: __libc_init()
LibC->>App: Call main()
The __libc_preinit_impl function performs these critical steps:
- TLS generation synchronization -- Registers libc's copy of the TLS generation counter with the linker so TLS modules stay in sync.
- Global variable initialization -- Sets up __libc_globals, a write-protected structure containing the allocator dispatch table.
- Common initialization -- Calls __libc_init_common(), which initializes the system properties client, sets up the environ pointer, and configures the heap allocator.
- Netd client initialization -- Registers DNS resolution hooks.
- Callback registration -- Provides the linker with callbacks for HWASan library load/unload events and MTE stack remapping.
From bionic/libc/bionic/libc_init_common.cpp (lines 58-61):
__LIBC_HIDDEN__ constinit WriteProtected<libc_globals> __libc_globals;
__LIBC_HIDDEN__ constinit _Atomic(bool) __libc_memtag_stack;
__LIBC_HIDDEN__ constinit bool __libc_memtag_stack_abi;
The WriteProtected<> template maps the globals structure into memory that is
normally read-only. Modifications require explicitly acquiring a
ProtectedDataGuard, which temporarily remaps the page as writable. This
defends against corruption of critical data like the allocator dispatch table.
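The write-protection idea can be sketched with `mmap` and `mprotect`. This is a simplified illustration under stated assumptions, not the WriteProtected<> implementation: `protected_init`, `protected_write`, and `protected_read` are hypothetical helpers, and a single anonymous page stands in for the globals structure.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on strict compilers */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static void *g_page = NULL;
static size_t g_page_size = 0;

/* Map one page that is read-only in its normal state. */
int protected_init(void) {
    g_page_size = (size_t)sysconf(_SC_PAGESIZE);
    g_page = mmap(NULL, g_page_size, PROT_READ,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (g_page == MAP_FAILED) {
        g_page = NULL;
        return -1;
    }
    return 0;
}

/* Open a short write window, copy the data in, then close the window. */
int protected_write(const void *src, size_t len) {
    if (g_page == NULL || len > g_page_size) return -1;
    if (mprotect(g_page, g_page_size, PROT_READ | PROT_WRITE) != 0) return -1;
    memcpy(g_page, src, len);
    return mprotect(g_page, g_page_size, PROT_READ);
}

/* Reads need no special handling: the page is always readable. */
const void *protected_read(void) { return g_page; }
```

A stray write to the page outside `protected_write` faults instead of silently corrupting the data, which is the property the text describes.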
7.1.5 Thread-Local Storage and the Bionic TCB
Bionic's TLS implementation is tightly integrated with the kernel. Each thread
has a Thread Control Block (TCB) accessible via a dedicated register
(TPIDR_EL0 on AArch64, the FS segment on x86-64). The TCB layout is defined in
bionic/libc/private/bionic_tls.h.
From bionic/libc/bionic/pthread_create.cpp (lines 62-71):
__attribute__((no_stack_protector))
void __init_tcb_stack_guard(bionic_tcb* tcb) {
  // GCC looks in the TLS for the stack guard on x86, so copy it there
  // from our global.
  tcb->tls_slot(TLS_SLOT_STACK_GUARD) = reinterpret_cast<void*>(__stack_chk_guard);
}

void __init_bionic_tls_ptrs(bionic_tcb* tcb, bionic_tls* tls) {
  tcb->thread()->bionic_tcb = tcb;
  tcb->thread()->bionic_tls = tls;
  tcb->tls_slot(TLS_SLOT_BIONIC_TLS) = tls;
}
Key TLS slots include:
| Slot | Purpose |
|---|---|
| TLS_SLOT_SELF | Pointer to the TCB itself |
| TLS_SLOT_THREAD_ID | Thread ID for fast gettid() |
| TLS_SLOT_STACK_GUARD | Stack canary for -fstack-protector |
| TLS_SLOT_BIONIC_TLS | Pointer to the full bionic_tls structure |
| TLS_SLOT_DTV | Dynamic Thread Vector for ELF TLS |
| TLS_SLOT_ART | Reserved for the Android Runtime |
This fixed layout means that accessing thread-local state requires no function calls or hash table lookups -- just a register read and a constant offset. The stack guard canary, in particular, is accessed on every function entry and exit in stack-protected code, so its placement in a fixed TLS slot is critical for performance.
7.1.6 Architecture-Specific Optimizations
Bionic provides architecture-specific implementations for performance-critical functions. The most notable are the string and memory operations.
IFUNC (Indirect Function) Dispatch:
On AArch64, functions like memcpy, memset, strcmp, and strlen are
dispatched at program startup via GNU IFUNC resolvers. The resolver examines
CPU capabilities and selects the optimal implementation.
From bionic/libc/arch-arm64/ifuncs.cpp (lines 36-49, 69-79):
inline int implementer(uint64_t midr_el1) { return (midr_el1 >> 24) & 0xff; }
inline int variant(uint64_t midr_el1) { return (midr_el1 >> 20) & 0xf; }
inline int part(uint64_t midr_el1) { return (midr_el1 >> 4) & 0xfff; }
inline int revision(uint64_t midr_el1) { return (midr_el1 >> 0) & 0xf; }

static inline bool __bionic_is_oryon(unsigned long hwcap) {
  if (!(hwcap & HWCAP_CPUID)) return false;
  unsigned long midr;
  __asm__ __volatile__("mrs %0, MIDR_EL1" : "=r"(midr));
  return implementer(midr) == 'Q' && part(midr) <= 15;
}
// ...
DEFINE_IFUNC_FOR(memcpy) {
  if (arg->_hwcap2 & HWCAP2_MOPS) {
    RETURN_FUNC(memcpy_func_t, __memmove_aarch64_mops);
  } else if (__bionic_is_oryon(arg->_hwcap)) {
    RETURN_FUNC(memcpy_func_t, __memcpy_aarch64_nt);
  } else if (arg->_hwcap & HWCAP_ASIMD) {
    RETURN_FUNC(memcpy_func_t, __memcpy_aarch64_simd);
  } else {
    RETURN_FUNC(memcpy_func_t, __memcpy_aarch64);
  }
}
This code reveals four memcpy implementations for AArch64:
- MOPS (Memory Operations) -- Uses the Armv8.8-A memory-copy instructions (the CPY* family, reached through __memmove_aarch64_mops) for hardware-accelerated copies. The resolver checks for it first, so it is chosen whenever the hardware advertises HWCAP2_MOPS.
- Oryon non-temporal -- Qualcomm Oryon cores (implementer 'Q', parts 0-15) benefit from non-temporal stores that bypass the cache hierarchy for large copies. The implementation is in bionic/libc/arch-arm64/oryon/memcpy-nt.S.
- ASIMD (NEON) -- Uses 128-bit SIMD load/store pairs. The standard fast path for most AArch64 devices.
- Generic -- A scalar fallback for cores that lack ASIMD (theoretical on AArch64, but present for completeness).
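The selection order can be modeled on any host without reading MIDR_EL1. In this sketch the field extractors mirror the shifts shown above, while `cpu_caps`, `copy_impl`, and `select_memcpy` are hypothetical stand-ins for the hwcap bits and the IFUNC return values.

```c
#include <stdbool.h>
#include <stdint.h>

/* MIDR_EL1 field extractors, using the same shifts and masks as the resolver. */
static inline unsigned midr_implementer(uint64_t midr) { return (midr >> 24) & 0xff; }
static inline unsigned midr_variant(uint64_t midr) { return (midr >> 20) & 0xf; }
static inline unsigned midr_part(uint64_t midr) { return (midr >> 4) & 0xfff; }
static inline unsigned midr_revision(uint64_t midr) { return midr & 0xf; }

typedef enum { COPY_MOPS, COPY_ORYON_NT, COPY_SIMD, COPY_GENERIC } copy_impl;

/* Hypothetical capability summary standing in for the hwcap/hwcap2 bits. */
typedef struct {
    bool has_mops;   /* models HWCAP2_MOPS */
    bool has_cpuid;  /* models HWCAP_CPUID (MIDR readable from user space) */
    bool has_asimd;  /* models HWCAP_ASIMD */
    uint64_t midr;   /* value that MIDR_EL1 would return */
} cpu_caps;

/* Same priority order as the memcpy resolver: MOPS, Oryon, ASIMD, generic. */
copy_impl select_memcpy(const cpu_caps *c) {
    if (c->has_mops) return COPY_MOPS;
    if (c->has_cpuid && midr_implementer(c->midr) == 'Q' &&
        midr_part(c->midr) <= 15) {
        return COPY_ORYON_NT;
    }
    if (c->has_asimd) return COPY_SIMD;
    return COPY_GENERIC;
}
```

For the synthetic value 0x511F0011, the implementer field is 0x51 (ASCII 'Q') and the part number is 1, so a core reporting it without MOPS would take the non-temporal path.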
Similarly, memchr has MTE-aware and standard variants:
DEFINE_IFUNC_FOR(memchr) {
  if (arg->_hwcap2 & HWCAP2_MTE) {
    RETURN_FUNC(memchr_func_t, __memchr_aarch64_mte);
  } else {
    RETURN_FUNC(memchr_func_t, __memchr_aarch64);
  }
}
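The MTE-aware variant must not read past the 16-byte tag granules that contain the search buffer: with MTE enabled, an over-read into neighboring memory that carries a different tag faults, so the wide over-reading loads used by the standard variant are unsafe.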
Architecture-specific assembly files:
Each architecture directory contains hand-written assembly for the most critical paths:
| Architecture | Key Assembly Files |
|---|---|
| arch-arm64/bionic/ | syscall.S, setjmp.S, vfork.S, __bionic_clone.S |
| arch-arm64/string/ | __memcpy_chk.S, __memset_chk.S |
| arch-arm64/oryon/ | memcpy-nt.S, memset-nt.S |
| arch-arm/bionic/ | Cortex-A53/A55/A7/A9/A15/Krait/Kryo-specific routines |
| arch-x86_64/bionic/ | syscall.S, setjmp.S |
| arch-x86_64/string/ | SSE/AVX-optimized string operations |
| arch-riscv64/bionic/ | syscall.S, setjmp.S |
| arch-riscv64/string/ | RISC-V string operations |
The ARM 32-bit tree is particularly rich, with CPU-specific subdirectories for
Cortex-A53, Cortex-A55, Cortex-A7, Cortex-A9, Cortex-A15, Krait (Qualcomm),
and Kryo (Qualcomm). The IFUNC resolver on ARM selects among these at runtime
based on /proc/cpuinfo or HWCAP values.
7.1.7 Upstream Code and the BSD Heritage
Bionic does not implement everything from scratch. It imports code from three BSD operating systems:
- OpenBSD: Provides strlcpy, strlcat, arc4random, reallocarray, and much of the standard string library. OpenBSD's focus on security makes it a natural source for hardened implementations.
- FreeBSD: Contributes parts of the math library (libm), locale support, and some string functions.
- NetBSD: Provides the DNS resolver (bionic/libc/dns/) and some miscellaneous utility functions.
Imports are kept in separate directories (upstream-openbsd/, upstream-freebsd/,
upstream-netbsd/) and are periodically updated to incorporate upstream bug
fixes and security patches.
7.1.8 The Property System Client
Android's property system (__system_property_get, __system_property_set)
is implemented partly in Bionic. The client-side code in
bionic/libc/system_properties/ provides lock-free reads from a shared memory
region mapped into every process. This is how every process on Android can read
system properties without IPC overhead.
The property area is initialized during __libc_init_common():
From bionic/libc/bionic/libc_init_common.cpp (line 54):
This function maps the property area file (/dev/__properties__/) and sets up
the internal data structures for property reads.
7.1.9 Bionic vs. glibc: Feature Comparison
| Feature | Bionic | glibc |
|---|---|---|
| License | BSD | LGPL |
| Size (stripped) | ~1 MB | ~8 MB |
| Locale support | Minimal (ASCII + UTF-8) | Full ICU-level |
| NSS modules | No | Yes |
| Thread cancellation | No | Yes |
| Stack protector | Fixed TLS slot | Variable offset |
| Default allocator | Scudo | ptmalloc2 |
| dlopen from APK | Yes (ZIP file support) | No |
| android_dlopen_ext | Yes | N/A |
| seccomp integration | Built-in | External |
| Property system | Built-in | N/A |
| FORTIFY_SOURCE | Enhanced | Standard |
7.1.10 Memory Safety Features
Bionic incorporates several memory safety features that have no glibc equivalent:
MTE (Memory Tagging Extension):
On Armv8.5-A and later hardware, Bionic supports MTE for both heap and stack
memory. The note_memtag_heap_async.S and note_memtag_heap_sync.S files in
arch-arm64/bionic/ contain ELF notes that request MTE for heap allocations.
Scudo hardened allocator:
Bionic's default allocator is Scudo, a security-hardened allocator that provides
guard pages, quarantine zones, and integrity checks. The dispatch mechanism in
malloc_common.cpp allows Scudo to be transparently replaced with debug
allocators.
GWP-ASan:
A sampling allocator that catches use-after-free and buffer overflow bugs in
production, integrated via gwp_asan_wrappers.h.
FORTIFY_SOURCE: Bionic's FORTIFY implementation is more aggressive than glibc's, with additional compile-time and runtime checks for buffer overflows in string and memory functions.
Tagged pointers: Even without MTE hardware, Bionic can tag the top byte of heap pointers (Top-Byte Ignore / TBI on ARM) to detect certain classes of memory corruption.
graph TD
A[malloc call] --> B{Dispatch Table?}
B -->|Debug malloc| C[Debug Allocator]
B -->|Normal| D[Scudo Allocator]
D --> E{GWP-ASan Sample?}
E -->|Yes| F[GWP-ASan Guard Page Allocation]
E -->|No| G[Scudo Normal Allocation]
G --> H{MTE Enabled?}
H -->|Yes| I[Tag Memory with Random Tag]
H -->|No| J{TBI Tagging?}
J -->|Yes| K[Tag Top Byte of Pointer]
J -->|No| L[Return Raw Pointer]
I --> L
K --> L
F --> L
C --> L
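The top-byte tagging step in the diagram comes down to bit arithmetic on a 64-bit address. The sketch below works on integer values only; `tag_pointer`, `untag_pointer`, and `pointer_tag` are hypothetical helpers, and the tag value passed in is arbitrary rather than the one Bionic uses.

```c
#include <stdint.h>

_Static_assert(sizeof(uintptr_t) == 8, "sketch assumes 64-bit addresses");

#define TAG_SHIFT 56
#define TAG_MASK ((uintptr_t)0xff << TAG_SHIFT)

/* Place a tag in bits 63..56, leaving the address bits untouched. */
uintptr_t tag_pointer(uintptr_t addr, uint8_t tag) {
    return (addr & ~TAG_MASK) | ((uintptr_t)tag << TAG_SHIFT);
}

/* Clear the top byte to recover the plain address. */
uintptr_t untag_pointer(uintptr_t addr) {
    return addr & ~TAG_MASK;
}

/* Read the tag back out of a tagged value. */
uint8_t pointer_tag(uintptr_t addr) {
    return (uint8_t)(addr >> TAG_SHIFT);
}
```

On hardware with Top-Byte Ignore the tagged value can be dereferenced directly; on other hardware it must be untagged first, which is why this sketch never dereferences anything.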
7.2 System Call Interface
7.2.1 How System Calls Work on Android
Every interaction between user-space code and the Linux kernel passes through a system call. Bionic provides the user-space half of this interface: the thin assembly stubs that transition from user mode to kernel mode, and the C wrapper functions that provide the POSIX API.
The system call interface has three layers:
graph TD
A["Application Code<br/>(e.g., open(), read())"] --> B["Bionic C Wrapper<br/>(bionic/libc/bionic/*.cpp)"]
B --> C["Assembly Stub<br/>(generated from SYSCALLS.TXT)"]
C --> D["Kernel Entry<br/>(SVC #0 on ARM64)"]
D --> E["Linux Kernel<br/>System Call Handler"]
style A fill:#e1f5fe
style B fill:#f3e5f5
style C fill:#fff3e0
style D fill:#fce4ec
style E fill:#e8f5e9
7.2.2 SYSCALLS.TXT: The System Call Definition File
All system call stubs in Bionic are auto-generated from a single definition file:
Source file: bionic/libc/SYSCALLS.TXT (384 lines)
From bionic/libc/SYSCALLS.TXT (lines 1-14):
# This file is used to automatically generate bionic's system call stubs.
#
# It is processed by a python script named gensyscalls.py,
# normally run via the genrules in libc/Android.bp.
#
# Each non-blank, non-comment line has the following format:
#
# func_name[|alias_list][:syscall_name[:socketcall_id]]([parameter_list]) arch_list
#
# where:
# arch_list ::= "all" | arches
# arches ::= arch | arch "," arches
# arch ::= "arm" | "arm64" | "riscv64" | "x86" | "x86_64" | "lp32" | "lp64"
Each line in SYSCALLS.TXT describes one system call with its function name, optional aliases, parameter types, and the architectures on which it should be generated. The format supports several important patterns:
Direct system call mapping:
Renamed system calls (where the C name differs from the kernel name):
The __close:close syntax means "generate a function named __close that
invokes the kernel's close system call." The actual close() function that
applications call is a C wrapper in bionic/libc/bionic/ that performs
additional work (like FORTIFY checks or fdsan validation) before calling
__close.
Architecture-conditional system calls:
On 32-bit platforms (lp32), the getuid function calls the kernel's
getuid32 system call (because the original getuid uses 16-bit UIDs). On
64-bit platforms (lp64), it calls getuid directly.
Aliased functions:
The pipe symbol creates multiple symbol aliases that share the same
implementation. On 64-bit systems, lseek and lseek64 are identical because
off_t is 64-bit.
x86 socketcall multiplexing:
__socket:socketcall:1(int, int, int) x86
__connect:socketcall:3(int, struct sockaddr*, socklen_t) x86
On 32-bit x86, socket operations are multiplexed through a single socketcall
system call, with a numeric sub-command. Bionic's generator handles this
automatically.
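To make the line format concrete, here is a small parser for the `func_name[|alias_list][:syscall_name[:socketcall_id]]([parameter_list]) arch_list` grammar quoted from the file header. It is an illustrative sketch, not a port of gensyscalls.py: the `syscall_entry` struct and `parse_syscall_line` are hypothetical, and it assumes the kernel name defaults to the function name when no `:syscall_name` is given.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char func[64];          /* C symbol to generate */
    char aliases[64];       /* text after '|', empty if none */
    char syscall_name[64];  /* kernel name (defaults to func) */
    int socketcall_id;      /* -1 when absent */
    char params[128];       /* text between the parentheses */
    char arches[64];        /* arch_list, e.g. "all" or "arm,x86" */
} syscall_entry;

int parse_syscall_line(const char *line, syscall_entry *out) {
    const char *open = strchr(line, '(');
    const char *close = strrchr(line, ')');
    if (open == NULL || close == NULL || close < open) return -1;

    memset(out, 0, sizeof *out);
    out->socketcall_id = -1;

    /* Head: func_name[|alias_list][:syscall_name[:socketcall_id]] */
    char head[128];
    snprintf(head, sizeof head, "%.*s", (int)(open - line), line);

    char *colon = strchr(head, ':');
    if (colon != NULL) {
        *colon = '\0';
        char *colon2 = strchr(colon + 1, ':');
        if (colon2 != NULL) {
            *colon2 = '\0';
            out->socketcall_id = atoi(colon2 + 1);
        }
        snprintf(out->syscall_name, sizeof out->syscall_name, "%s", colon + 1);
    }
    char *bar = strchr(head, '|');
    if (bar != NULL) {
        *bar = '\0';
        snprintf(out->aliases, sizeof out->aliases, "%s", bar + 1);
    }
    snprintf(out->func, sizeof out->func, "%s", head);
    if (colon == NULL) {
        /* Assumption: with no explicit name, the kernel name equals func. */
        snprintf(out->syscall_name, sizeof out->syscall_name, "%s", out->func);
    }

    snprintf(out->params, sizeof out->params, "%.*s",
             (int)(close - open - 1), open + 1);

    const char *arch = close + 1;
    while (*arch == ' ' || *arch == '\t') arch++;
    snprintf(out->arches, sizeof out->arches, "%s", arch);
    out->arches[strcspn(out->arches, "\r\n")] = '\0';
    return 0;
}
```

Fed the `__connect:socketcall:3(int, struct sockaddr*, socklen_t) x86` line shown above, the parser yields function `__connect`, kernel name `socketcall`, sub-command 3, and architecture list `x86`.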
7.2.3 System Call Stub Generation
The gensyscalls.py script (bionic/libc/tools/gensyscalls.py) reads
SYSCALLS.TXT and generates architecture-specific assembly stubs. The supported
architectures are:
ARM 32-bit stub (4 or fewer register arguments):
ENTRY(%(func)s)
    mov     ip, r7
    .cfi_register r7, ip
    ldr     r7, =%(NR_name)s
    swi     #0
    mov     r7, ip
    .cfi_restore r7
    cmn     r0, #(MAX_ERRNO + 1)
    bxls    lr
    neg     r0, r0
    b       __set_errno_internal
END(%(func)s)
On ARM, the system call number goes in register r7, and the SWI (Software Interrupt) instruction traps into the kernel. The stub saves and restores r7 (which is the frame pointer in Thumb mode) to avoid corrupting the call stack.
AArch64 syscall function:
From bionic/libc/arch-arm64/bionic/syscall.S (lines 31-49):
ENTRY(syscall)
    /* Move syscall No. from x0 to x8 */
    mov     x8, x0
    /* Move syscall parameters from x1 thru x6 to x0 thru x5 */
    mov     x0, x1
    mov     x1, x2
    mov     x2, x3
    mov     x3, x4
    mov     x4, x5
    mov     x5, x6
    svc     #0
    /* check if syscall returned successfully */
    cmn     x0, #(MAX_ERRNO + 1)
    cneg    x0, x0, hi
    b.hi    __set_errno_internal
    ret
END(syscall)
This is the generic syscall() function for AArch64. The system call number
goes in x8, and up to six arguments go in x0-x5. The SVC #0 instruction
enters the kernel. On return, if x0 contains a value in the range
[-MAX_ERRNO, -1], the error is negated and stored in errno via
__set_errno_internal.
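The return-value convention the stub implements can be written out in C. This is a sketch of the convention only: `finish_syscall` is a hypothetical helper, and MAX_ERRNO is taken to be 4095, the value the Linux kernel uses for its error range.

```c
#include <errno.h>

#define MAX_ERRNO 4095

/* Convert a raw kernel return value into the C library convention:
 * values in [-MAX_ERRNO, -1] become -1 with errno set, and everything
 * else is passed through unchanged. */
long finish_syscall(long raw) {
    /* The unsigned comparison plays the role of the cmn + "hi" check:
     * only the top MAX_ERRNO values of the unsigned range are errors. */
    if ((unsigned long)raw >= (unsigned long)-MAX_ERRNO) {
        errno = (int)-raw;
        return -1;
    }
    return raw;
}
```

The unsigned comparison matters because a successful call such as mmap can legitimately return a value with the sign bit set; only the narrow band just below zero is treated as an error.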
7.2.4 The System Call Catalog
SYSCALLS.TXT defines system calls in several categories. Here is a breakdown of the major groups:
Process and identity management:
getuid(), getgid(), geteuid(), getegid()
setuid(), setgid(), setresuid(), setresgid()
getpid(), getppid(), getpgid(), getsid()
kill(), tgkill()
execve(), clone(), _exit()
File descriptors:
read(), write(), pread64(), pwrite64()
__close:close(), __openat:openat()
__fcntl64:fcntl64() (lp32), __fcntl:fcntl() (lp64)
__dup:dup(), __dup3:dup3()
Memory management:
__mmap2:mmap2() (lp32), mmap|mmap64() (lp64)
munmap(), mprotect(), madvise(), mremap()
__brk:brk(), mseal() (lp64 only)
File system:
Networking (per-architecture):
Signals:
__rt_sigaction:rt_sigaction()
__rt_sigprocmask:rt_sigprocmask()
__rt_sigsuspend:rt_sigsuspend()
__signalfd4:signalfd4()
Architecture-specific:
__set_tls:__ARM_NR_set_tls(void*) arm
cacheflush:__ARM_NR_cacheflush(long, long, long) arm
__riscv_flush_icache:riscv_flush_icache(void*, void*, unsigned long) riscv64
__set_thread_area:set_thread_area(void*) x86
arch_prctl(int, unsigned long) x86_64
VDSO-accelerated calls:
__clock_getres:clock_getres(clockid_t, struct timespec*) all
__clock_gettime:clock_gettime(clockid_t, struct timespec*) all
__gettimeofday:gettimeofday(struct timeval*, struct timezone*) all
These three system calls are typically handled by the VDSO (Virtual Dynamic Shared Object), which the kernel maps into every process. The VDSO contains user-space implementations of these calls that read from kernel-managed shared memory pages, avoiding the overhead of a full kernel transition. Bionic's dynamic linker explicitly loads the VDSO (see Section 7.3).
7.2.5 LP32 vs. LP64 Differences
The system call interface differs significantly between 32-bit and 64-bit platforms:
graph LR
subgraph "LP32 (32-bit)"
A1["off_t = 32 bits<br/>uid_t = 16 bits (historical)"]
A2["getuid:getuid32()"]
A3["lseek() + __llseek()"]
A4["__mmap2:mmap2()"]
A5["fstat64()"]
A6["prlimit64()"]
A7["*_time64() variants"]
end
subgraph "LP64 (64-bit)"
B1["off_t = 64 bits<br/>uid_t = 32 bits"]
B2["getuid()"]
B3["lseek|lseek64()"]
B4["mmap|mmap64()"]
B5["fstat64|fstat()"]
B6["prlimit64|prlimit()"]
B7["Standard time calls"]
end
style A1 fill:#fff3e0
style B1 fill:#e1f5fe
On 32-bit systems, many system calls have 64 suffixes or use register pairs
for 64-bit arguments. The SYSCALLS.TXT generator handles the ABI requirements
automatically, including ARM's constraint that 64-bit argument pairs must start
on an even-numbered register.
The time64 variants (lines 76-91 of SECCOMP_ALLOWLIST_COMMON.TXT) are
particularly notable:
clock_gettime64(clockid_t, timespec64*) lp32
clock_settime64(clockid_t, const timespec64*) lp32
futex_time64(int*, int, int, const timespec64*, int*, int) lp32
These were added to address the Y2038 problem: a signed 32-bit time_t overflows in January 2038. The *_time64 system calls use 64-bit time structures even on 32-bit platforms.
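The overflow date follows directly from the largest value a signed 32-bit counter can hold. The sketch below assumes a host with a 64-bit time_t, so the conversion itself does not overflow; `last_32bit_second` and `wrapped_next_second` are hypothetical helper names.

```c
#define _POSIX_C_SOURCE 200809L /* for gmtime_r on strict compilers */
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Format the last second representable in a signed 32-bit time_t (UTC). */
int last_32bit_second(char *buf, size_t cap) {
    time_t t = (time_t)INT32_MAX;
    struct tm tm_utc;
    if (gmtime_r(&t, &tm_utc) == NULL) return -1;
    return strftime(buf, cap, "%Y-%m-%d %H:%M:%S", &tm_utc) ? 0 : -1;
}

/* What a 32-bit counter holds one second later. The arithmetic is done
 * unsigned to avoid signed overflow; the conversion back wraps to
 * INT32_MIN on the usual two's-complement compilers. */
int32_t wrapped_next_second(void) {
    return (int32_t)((uint32_t)INT32_MAX + 1u);
}
```

`last_32bit_second` produces `2038-01-19 03:14:07`; one second later a 32-bit counter wraps to its most negative value, which decodes as a date in December 1901.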
7.2.6 Seccomp-BPF: System Call Filtering
Android restricts which system calls are available to application processes using seccomp-BPF (Secure Computing with Berkeley Packet Filter). This is a critical security boundary: even if an attacker achieves arbitrary code execution within an app process, they cannot invoke dangerous system calls that the seccomp filter blocks.
The seccomp policy is built from multiple text files:
| File | Purpose |
|---|---|
| SYSCALLS.TXT | Base set of system calls bionic needs |
| SECCOMP_ALLOWLIST_COMMON.TXT | Additional allowed calls (all processes) |
| SECCOMP_ALLOWLIST_APP.TXT | Additional allowed calls (app processes only) |
| SECCOMP_ALLOWLIST_SYSTEM.TXT | Additional allowed calls (system server only) |
| SECCOMP_BLOCKLIST_APP.TXT | Calls removed from apps even if in SYSCALLS.TXT |
| SECCOMP_BLOCKLIST_COMMON.TXT | Calls removed from all Zygote children |
| SECCOMP_PRIORITY.TXT | Syscalls to check first (hot path optimization) |
The formula for the final policy:
From bionic/libc/SECCOMP_BLOCKLIST_APP.TXT (lines 1-7):
# The final seccomp allowlist is SYSCALLS.TXT - SECCOMP_BLOCKLIST.TXT
# + SECCOMP_ALLOWLIST.TXT
# Any entry in the blocklist must be in the syscalls file and not be in
# the allowlist file
Blocked system calls for apps:
The SECCOMP_BLOCKLIST_APP.TXT file (51 lines) removes dangerous system calls
from app processes:
# Syscalls to modify IDs.
setgid32(gid_t) lp32
setgid(gid_t) lp64
setuid32(uid_t) lp32
setuid(uid_t) lp64
# Syscalls to modify times.
adjtimex(struct timex*) all
clock_adjtime(clockid_t, struct timex*) all
clock_settime(clockid_t, const struct timespec*) all
settimeofday(const struct timeval*, const struct timezone*) all
# Dangerous operations
chroot(const char*) all
init_module(void*, unsigned long, const char*) all
delete_module(const char*, unsigned int) all
mount(const char*, const char*, const char*, unsigned long, const void*) all
reboot(int, int, int, void*) all
These are system calls that exist in SYSCALLS.TXT (because system daemons need them) but are too dangerous for unprivileged app processes.
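The policy formula is plain set arithmetic: start from the base set, subtract the blocklist, then add the allowlist. The sketch below models it with string arrays; `policy_allows` is a hypothetical function, and the real generator works on parsed syscall numbers per architecture rather than names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Linear membership test over a small array of syscall names. */
static bool contains(const char *const *set, size_t n, const char *name) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(set[i], name) == 0) return true;
    }
    return false;
}

/* allowed = (base - blocklist) + allowlist */
bool policy_allows(const char *name,
                   const char *const *base, size_t nbase,
                   const char *const *block, size_t nblock,
                   const char *const *allow, size_t nallow) {
    if (contains(allow, nallow, name)) return true;
    return contains(base, nbase, name) && !contains(block, nblock, name);
}
```

With a base set containing `read` and `mount`, a blocklist containing `mount`, and an allowlist containing `pipe`, the result permits `read` and `pipe` and rejects `mount`.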
The common blocklist (SECCOMP_BLOCKLIST_COMMON.TXT) adds:
The app allowlist (SECCOMP_ALLOWLIST_APP.TXT, 62 lines) re-enables
specific calls that apps need but are not in the base SYSCALLS.TXT set, often
for backward compatibility:
# Needed for debugging 32-bit Chrome
pipe(int pipefd[2]) lp32
# b/34813887
open(const char *path, int oflag, ... ) lp32,x86_64
# Not used by bionic in U because riscv64 doesn't have it, but still
# used by legacy apps (http://b/254179267).
renameat(int, const char*, int, const char*) arm,x86,arm64,x86_64
Most entries carry a comment explaining the exception, usually a reference to an Android bug tracker ID.
Priority optimization:
From bionic/libc/SECCOMP_PRIORITY.TXT (lines 9-10):
These two system calls are checked first in the BPF filter. Since futex and
ioctl are the most frequently invoked system calls in a typical Android
process (futex for mutex/condvar operations, ioctl for Binder IPC), checking
them first minimizes the average number of BPF instructions executed per system
call.
7.2.7 Seccomp Policy Installation
The seccomp filter is installed by the Zygote process before it forks
application processes. The implementation is in
bionic/libc/seccomp/seccomp_policy.cpp.
From bionic/libc/seccomp/seccomp_policy.cpp (lines 33-94):
#if defined __arm__ || defined __aarch64__
#define PRIMARY_ARCH AUDIT_ARCH_AARCH64
static const struct sock_filter* primary_app_filter = arm64_app_filter;
// ...
#define SECONDARY_ARCH AUDIT_ARCH_ARM
static const struct sock_filter* secondary_app_filter = arm_app_filter;
// ...
#elif defined __i386__ || defined __x86_64__
#define PRIMARY_ARCH AUDIT_ARCH_X86_64
// ...
#define SECONDARY_ARCH AUDIT_ARCH_I386
// ...
#elif defined(__riscv)
#define PRIMARY_ARCH AUDIT_ARCH_RISCV64
// ...
#endif
The filter handles dual-architecture systems (e.g., a 64-bit kernel running 32-bit apps) by checking the architecture field in the seccomp data structure and jumping to the appropriate filter:
From bionic/libc/seccomp/seccomp_policy.cpp (lines 128-141):
static size_t ValidateArchitectureAndJumpIfNeeded(filter& f) {
  f.push_back(BPF_STMT(BPF_LD|BPF_W|BPF_ABS, arch_nr));
  f.push_back(BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, PRIMARY_ARCH, 2, 0));
  f.push_back(BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, SECONDARY_ARCH, 1, 0));
  Disallow(f);
  return f.size() - 2;
}
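The jump offsets in the architecture check are easier to follow when the filter is executed by hand. The sketch below builds the same four-instruction prefix with the Linux UAPI macros and runs it through a tiny interpreter that understands only the three opcodes used here. `build_arch_check` and `run_filter` are hypothetical, and both architecture branches simply fall into an allow instruction, whereas the real policy patches the secondary jump to reach a second filter.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Emit: load arch; primary match skips 2; secondary match skips 1; trap; allow. */
size_t build_arch_check(struct sock_filter *f, uint32_t primary, uint32_t secondary) {
    size_t n = 0;
    f[n++] = (struct sock_filter)BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                                          offsetof(struct seccomp_data, arch));
    f[n++] = (struct sock_filter)BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, primary, 2, 0);
    f[n++] = (struct sock_filter)BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, secondary, 1, 0);
    f[n++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_TRAP);
    f[n++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW);
    return n;
}

/* Minimal interpreter for the three opcodes above; not a general BPF VM. */
uint32_t run_filter(const struct sock_filter *f, size_t len,
                    const struct seccomp_data *data) {
    uint32_t acc = 0;
    for (size_t pc = 0; pc < len; pc++) {
        const struct sock_filter *in = &f[pc];
        switch (in->code) {
        case BPF_LD | BPF_W | BPF_ABS:
            memcpy(&acc, (const char *)data + in->k, sizeof acc);
            break;
        case BPF_JMP | BPF_JEQ | BPF_K:
            /* Offsets are relative to the next instruction. */
            pc += (acc == in->k) ? in->jt : in->jf;
            break;
        case BPF_RET | BPF_K:
            return in->k;
        default:
            return SECCOMP_RET_KILL;
        }
    }
    return SECCOMP_RET_KILL; /* fell off the end of the program */
}
```

Running the program with the `arch` field set to the primary or secondary value reaches the allow instruction, while any other architecture lands on the trap, which is the behavior the Disallow call provides in the original.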
The BPF program structure:
graph TD
A[System Call Entry] --> B{Check Architecture}
B -->|Primary 64-bit| C{Check Priority Syscalls}
B -->|Secondary 32-bit| D{Check Priority Syscalls 32-bit}
B -->|Unknown| E[SECCOMP_RET_TRAP]
C -->|futex| F[SECCOMP_RET_ALLOW]
C -->|ioctl| F
C -->|Other| G{Check Allowlist}
G -->|In allowlist| F
G -->|Not in allowlist| H{Check UID/GID Filter}
H -->|setresuid in range| F
H -->|Out of range| E
D -->|In 32-bit allowlist| F
D -->|Not allowed| E
style E fill:#ffcdd2
style F fill:#c8e6c9
Three separate filter profiles are generated:
- App filter -- For regular application processes
- App Zygote filter -- For app zygote processes (used by isolated services)
- System filter -- For system server and privileged daemons
The filters are compiled from C structures into BPF bytecode and installed
using prctl(PR_SET_SECCOMP):
From bionic/libc/seccomp/seccomp_policy.cpp (lines 193-199):
static bool install_filter(filter const& f) {
struct sock_fprog prog = {
static_cast<unsigned short>(f.size()),
const_cast<struct sock_filter*>(&f[0]),
};
if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) < 0) {
The SECCOMP_RET_TRAP action sends a SIGSYS signal to the process, which
Android's debuggerd captures for crash reporting. This produces a clear
crash report that identifies the forbidden system call, aiding debugging.
7.2.8 VDSO: Avoiding System Call Overhead
For the most performance-sensitive system calls, the kernel provides a Virtual Dynamic Shared Object (VDSO) -- a tiny shared library mapped by the kernel into every process's address space. Bionic's dynamic linker explicitly locates and links the VDSO.
From bionic/linker/linker_main.cpp (lines 184-205):
static void add_vdso() {
  ElfW(Ehdr)* ehdr_vdso = reinterpret_cast<ElfW(Ehdr)*>(
      getauxval(AT_SYSINFO_EHDR));
  if (ehdr_vdso == nullptr) {
    return;
  }
  vdso = soinfo_alloc(&g_default_namespace, "[vdso]", nullptr, 0, 0);
  vdso->phdr = reinterpret_cast<ElfW(Phdr)*>(
      reinterpret_cast<char*>(ehdr_vdso) + ehdr_vdso->e_phoff);
  vdso->phnum = ehdr_vdso->e_phnum;
  vdso->base = reinterpret_cast<ElfW(Addr)>(ehdr_vdso);
  vdso->size = phdr_table_get_load_size(vdso->phdr, vdso->phnum);
  vdso->load_bias = get_elf_exec_load_bias(ehdr_vdso);
  if (!vdso->prelink_image() ||
      !vdso->link_image(SymbolLookupList(vdso), vdso, nullptr, nullptr)) {
    __linker_cannot_link(g_argv[0]);
  }
  // Prevent accidental unloads...
  vdso->set_dt_flags_1(vdso->get_dt_flags_1() | DF_1_NODELETE);
  vdso->set_linked();
}
The VDSO is located via the AT_SYSINFO_EHDR auxiliary vector entry, which
the kernel places on the process stack at exec time. The linker treats the
VDSO like any other shared library -- creating a soinfo structure, running
the prelink and link phases -- but the VDSO's code runs entirely in user space,
reading kernel-maintained data structures to answer queries like "what time is
it?" without a mode switch.
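Any Linux process can perform the first step of this lookup itself. The sketch below uses `getauxval(AT_SYSINFO_EHDR)` to find the VDSO and checks the ELF magic at that address; `probe_vdso` is a hypothetical helper, and the sketch does not attempt the symbol lookup that the linker goes on to do.

```c
#include <elf.h>
#include <string.h>
#include <sys/auxv.h>

/* Returns 1 if a VDSO with a valid ELF magic is mapped, 0 if the kernel
 * advertises none, and -1 if the bytes at the address are not an ELF header. */
int probe_vdso(void) {
    unsigned long base = getauxval(AT_SYSINFO_EHDR);
    if (base == 0) {
        return 0; /* no VDSO: time calls fall back to real system calls */
    }
    const unsigned char *ident = (const unsigned char *)base;
    return memcmp(ident, ELFMAG, SELFMAG) == 0 ? 1 : -1;
}
```

A zero result is not an error: some environments do not provide a VDSO, and the early return in add_vdso handles the same case.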
VDSO-accelerated calls in Bionic:
- clock_gettime() -- The single most frequently called time function
- clock_getres() -- Clock resolution query
- gettimeofday() -- Legacy time-of-day query
7.3 The Dynamic Linker
7.3.1 Overview
The dynamic linker (/system/bin/linker64 on 64-bit devices, /system/bin/linker
on 32-bit) is responsible for loading every dynamically-linked executable and
shared library on Android. It is the first user-space code to execute after the
kernel maps a new process, and its correct operation is essential for every
native binary on the system.
The linker source lives in bionic/linker/ and comprises approximately 50
source files totaling well over 8,000 lines of C++ (the files listed below alone account for roughly 8,250). The key files are:
| File | Lines | Purpose |
|---|---|---|
| linker.cpp | 3,791 | Core linking logic: library search, loading, namespace management |
| linker_phdr.cpp | 1,737 | ELF parsing, segment loading, address space management |
| linker_main.cpp | 859 | Entry point, initialization, main linking sequence |
| linker_relocate.cpp | 686 | Relocation processing |
| linker_namespaces.h | 183 | Namespace data structures |
| linker_soinfo.h | ~400 | soinfo structure definition |
| linker_config.cpp | ~500 | Configuration file parser |
| dlfcn.cpp | ~100 | dlopen/dlsym API surface |
7.3.2 The Linker Entry Point
When the kernel executes a dynamically-linked ELF binary, it:
- Maps the executable's PT_LOAD segments
- Reads the PT_INTERP segment to find the linker path (e.g., /system/bin/linker64)
- Maps the linker into the process
- Sets up the auxiliary vector (AT_PHDR, AT_ENTRY, AT_BASE, etc.)
- Transfers control to the linker's entry point
The linker's entry point is _start (in architecture-specific assembly), which
calls __linker_init. This function faces a bootstrapping problem: the linker
is itself position-independent code loaded at an address chosen at run time,
so it must apply its own relocations before it can relocate anything else.
The solution is a two-phase initialization:
- Self-relocation -- Process the linker's own relocations using only position-independent code (no external symbol references)
- Main link -- Load and link the executable and all its dependencies
7.3.3 The Main Linking Sequence
The linker_main function in bionic/linker/linker_main.cpp orchestrates the
entire linking process.
From bionic/linker/linker_main.cpp (lines 297-525):
static ElfW(Addr) linker_main(KernelArgumentBlock& args,
const char* exe_to_load) {
ProtectedDataGuard guard;
// Sanitize the environment.
__libc_init_AT_SECURE(args.envp);
// Initialize system properties
__system_properties_init();
// Initialize platform properties.
platform_properties_init();
// Register the debuggerd signal handler.
linker_debuggerd_init();
The function proceeds through these phases:
graph TD
A["__linker_init<br/>(Self-relocate)"] --> B["linker_main()"]
B --> C["Sanitize environment<br/>(AT_SECURE check)"]
C --> D["Init system properties"]
D --> E["Init platform properties<br/>(BTI support on ARM64)"]
E --> F["Register debuggerd handler"]
F --> G["Parse LD_DEBUG,<br/>LD_LIBRARY_PATH, LD_PRELOAD"]
G --> H["Load/locate executable"]
H --> I["Create soinfo for executable"]
I --> J["Init linker config + namespaces"]
J --> K["Prelink executable<br/>(parse .dynamic section)"]
K --> L["Load DT_NEEDED + LD_PRELOAD<br/>(BFS dependency walk)"]
L --> M["Relocate all libraries"]
M --> N["Init VDSO"]
N --> O["Finalize static TLS"]
O --> P["Init CFI shadow"]
P --> Q["Call .preinit_array"]
Q --> R["Call .init_array for all libs"]
R --> S["Return executable entry point"]
style A fill:#fff3e0
style H fill:#e8f5e9
style L fill:#e1f5fe
style M fill:#f3e5f5
style R fill:#fce4ec
style S fill:#c8e6c9
Phase 1: Environment and Security
// These should have been sanitized by __libc_init_AT_SECURE, but the
// test doesn't cost us anything.
const char* ldpath_env = nullptr;
const char* ldpreload_env = nullptr;
if (!getauxval(AT_SECURE)) {
ldpath_env = getenv("LD_LIBRARY_PATH");
ldpreload_env = getenv("LD_PRELOAD");
}
When AT_SECURE is set (for example, because the executable is setuid or
setgid), LD_LIBRARY_PATH and LD_PRELOAD are ignored. This prevents privilege
escalation attacks where a user sets these variables to inject malicious
libraries into a privileged process.
Phase 2: Executable Initialization
From bionic/linker/linker_main.cpp (lines 340-358):
const ExecutableInfo exe_info = exe_to_load ?
load_executable(exe_to_load) :
get_executable_info(args.argv[0]);
soinfo* si = soinfo_alloc(&g_default_namespace,
exe_info.path.c_str(), &exe_info.file_stat,
0, RTLD_GLOBAL);
somain = si;
si->phdr = exe_info.phdr;
si->phnum = exe_info.phdr_count;
si->set_should_pad_segments(exe_info.should_pad_segments);
get_elf_base_from_phdr(si->phdr, si->phnum, &si->base, &si->load_bias);
si->size = phdr_table_get_load_size(si->phdr, si->phnum);
si->dynamic = nullptr;
si->set_main_executable();
init_link_map_head(*si);
set_bss_vma_name(si);
The get_executable_info function reads the executable's program headers from
the auxiliary vector (AT_PHDR, AT_PHNUM, AT_ENTRY). The kernel has
already mapped the executable, so the linker just needs to find the headers.
The soinfo structure is the linker's per-library metadata. It is allocated
from a custom block allocator (LinkerTypeAllocator<soinfo>) that maps memory
in page-sized blocks, enabling write-protection via ProtectedDataGuard.
Phase 3: Namespace Initialization and Dependency Loading
std::vector<android_namespace_t*> namespaces =
init_default_namespaces(exe_info.path.c_str());
if (!si->prelink_image()) __linker_cannot_link(g_argv[0]);
// Load ld_preloads and dependencies.
for (const ElfW(Dyn)* d = si->dynamic; d->d_tag != DT_NULL; ++d) {
if (d->d_tag == DT_NEEDED) {
const char* name = fix_dt_needed(
si->get_string(d->d_un.d_val), si->get_realpath());
needed_library_name_list.push_back(name);
}
}
if (!find_libraries(&g_default_namespace, si,
needed_library_names, needed_libraries_count,
nullptr, &g_ld_preloads, ld_preloads_count,
RTLD_GLOBAL, nullptr,
true /* add_as_children */, &namespaces)) {
__linker_cannot_link(g_argv[0]);
}
The prelink_image method parses the .dynamic section to extract symbol
tables, relocation tables, DT_NEEDED entries, and initialization/finalization
functions. The find_libraries function then performs a breadth-first
dependency walk, loading each library and adding it to the appropriate namespace.
Phase 4: Constructor Invocation and Handoff
si->call_pre_init_constructors();
si->call_constructors();
ElfW(Addr) entry = exe_info.entry_point;
return entry;
After all libraries are loaded and relocated, the linker calls initialization functions in dependency order (leaves first, roots last). It then returns the executable's entry point address, and control transfers to the application.
7.3.4 The soinfo Structure
The soinfo structure is the linker's representation of a loaded shared
library. Every library -- including the executable itself, the linker, and the
VDSO -- has one.
From bionic/linker/linker_soinfo.h (lines 157-248):
struct soinfo {
const ElfW(Phdr)* phdr;
size_t phnum;
ElfW(Addr) base;
size_t size;
ElfW(Dyn)* dynamic;
soinfo* next;
private:
uint32_t flags_;
const char* strtab_;
ElfW(Sym)* symtab_;
size_t nbucket_;
size_t nchain_;
uint32_t* bucket_;
uint32_t* chain_;
#if defined(USE_RELA)
ElfW(Rela)* plt_rela_;
size_t plt_rela_count_;
ElfW(Rela)* rela_;
size_t rela_count_;
#else
ElfW(Rel)* plt_rel_;
size_t plt_rel_count_;
ElfW(Rel)* rel_;
size_t rel_count_;
#endif
linker_ctor_function_t* preinit_array_;
size_t preinit_array_count_;
linker_ctor_function_t* init_array_;
size_t init_array_count_;
linker_dtor_function_t* fini_array_;
size_t fini_array_count_;
linker_ctor_function_t init_func_;
linker_dtor_function_t fini_func_;
#if defined(__arm__)
uint32_t* ARM_exidx;
size_t ARM_exidx_count;
#endif
link_map link_map_head;
bool constructors_called;
ElfW(Addr) load_bias;
bool has_DT_SYMBOLIC;
};
Key flags in the flags_ field:
| Flag | Value | Meaning |
|---|---|---|
| FLAG_LINKED | 0x00000001 | Library is fully linked |
| FLAG_EXE | 0x00000004 | This is the main executable |
| FLAG_LINKER | 0x00000010 | This is the linker itself |
| FLAG_GNU_HASH | 0x00000040 | Uses GNU hash table |
| FLAG_MAPPED_BY_CALLER | 0x00000080 | Memory was provided externally |
| FLAG_IMAGE_LINKED | 0x00000100 | link_image has run |
| FLAG_PRELINKED | 0x00000400 | prelink_image has run |
| FLAG_GLOBALS_TAGGED | 0x00000800 | MTE globals tagged |
The soinfo structures form a singly-linked list via the next pointer,
maintained by solist_add_soinfo and solist_remove_soinfo. The list order
is:
- The main executable (somain)
- The linker itself (solinker)
- The VDSO (if present)
- All other libraries in load order
7.3.5 ELF Loading: The ElfReader Class
The ElfReader class in bionic/linker/linker_phdr.cpp handles the mechanics
of reading and mapping ELF files into memory.
Reading an ELF file:
From bionic/linker/linker_phdr.cpp (lines 171-208):
bool ElfReader::Read(const char* name, int fd, off64_t file_offset,
off64_t file_size) {
if (did_read_) {
return true;
}
name_ = name;
fd_ = fd;
file_offset_ = file_offset;
file_size_ = file_size;
if (ReadElfHeader() &&
VerifyElfHeader() &&
ReadProgramHeaders() &&
CheckProgramHeaderAlignment() &&
ReadSectionHeaders() &&
ReadDynamicSection() &&
ReadPadSegmentNote()) {
did_read_ = true;
}
// ...
return did_read_;
}
The Read phase performs validation and reads metadata:
graph TD
A["ReadElfHeader()"] --> B["VerifyElfHeader()"]
B --> C["ReadProgramHeaders()"]
C --> D["CheckProgramHeaderAlignment()"]
D --> E["ReadSectionHeaders()"]
E --> F["ReadDynamicSection()"]
F --> G["ReadPadSegmentNote()"]
G --> H["16KiB compat check"]
B -->|"Bad magic"| X["DL_ERR: bad ELF magic"]
B -->|"Wrong class"| Y["DL_ERR: 32-bit vs 64-bit"]
B -->|"Wrong machine"| Z["DL_ERR: wrong architecture"]
style X fill:#ffcdd2
style Y fill:#ffcdd2
style Z fill:#ffcdd2
ELF header verification:
From bionic/linker/linker_phdr.cpp (lines 271-340):
bool ElfReader::VerifyElfHeader() {
if (memcmp(header_.e_ident, ELFMAG, SELFMAG) != 0) {
DL_ERR("\"%s\" has bad ELF magic", name_.c_str());
return false;
}
int elf_class = header_.e_ident[EI_CLASS];
#if defined(__LP64__)
if (elf_class != ELFCLASS64) {
if (elf_class == ELFCLASS32) {
DL_ERR("\"%s\" is 32-bit instead of 64-bit", name_.c_str());
}
return false;
}
#endif
if (header_.e_type != ET_DYN) {
DL_ERR("\"%s\" has unexpected e_type: %d", name_.c_str(), header_.e_type);
return false;
}
if (header_.e_machine != GetTargetElfMachine()) {
DL_ERR("\"%s\" is for %s instead of %s",
name_.c_str(),
EM_to_string(header_.e_machine),
EM_to_string(GetTargetElfMachine()));
return false;
}
return true;
}
The GetTargetElfMachine() function returns the expected ELF machine type
based on the compile-time architecture:
static int GetTargetElfMachine() {
#if defined(__arm__)
return EM_ARM;
#elif defined(__aarch64__)
return EM_AARCH64;
#elif defined(__i386__)
return EM_386;
#elif defined(__riscv)
return EM_RISCV;
#elif defined(__x86_64__)
return EM_X86_64;
#endif
}
Note that the linker requires e_type == ET_DYN. This means Android only loads
Position-Independent Executables (PIE). Non-PIE support was dropped in API level
21 for security (ASLR effectiveness):
if (elf_hdr->e_type != ET_DYN) {
__linker_error("error: Android only supports position-independent "
"executables (-fPIE)");
}
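The order of checks in VerifyElfHeader can be tried out on a hand-built header. The following is a minimal sketch for a 64-bit target only, not the linker's implementation: check_elf64_header and its string results are hypothetical, while Elf64_Ehdr and the EI_, ET_ and EM_ constants come from the standard elf.h header.

```cpp
#include <elf.h>
#include <cstring>
#include <string>

// Returns an empty string if the header passes every check, otherwise a
// short reason. The checks run in the same order as in the excerpt above.
inline std::string check_elf64_header(const Elf64_Ehdr& h, int expected_machine) {
  if (std::memcmp(h.e_ident, ELFMAG, SELFMAG) != 0) return "bad ELF magic";
  if (h.e_ident[EI_CLASS] != ELFCLASS64) return "not a 64-bit ELF";
  if (h.e_type != ET_DYN) return "unexpected e_type";
  if (h.e_machine != expected_machine) return "wrong machine";
  return "";
}
```

Passing a header whose e_type is ET_EXEC yields "unexpected e_type", which is the same condition that makes the real linker refuse non-PIE binaries.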
Loading segments into memory:
From bionic/linker/linker_phdr.cpp (lines 211-238):
bool ElfReader::Load(address_space_params* address_space) {
CHECK(did_read_);
if (did_load_) {
return true;
}
bool reserveSuccess = ReserveAddressSpace(address_space);
if (reserveSuccess && LoadSegments() && FindPhdr() &&
FindGnuPropertySection()) {
did_load_ = true;
#if defined(__aarch64__)
if (note_gnu_property_.IsBTICompatible()) {
did_load_ =
(phdr_table_protect_segments(phdr_table_, phdr_num_, load_bias_,
should_pad_segments_, should_use_16kib_app_compat_,
&note_gnu_property_) == 0);
}
#endif
}
return did_load_;
}
The Load phase:
- ReserveAddressSpace -- Allocates a contiguous virtual address range for all PT_LOAD segments via mmap(PROT_NONE).
- LoadSegments -- Maps each PT_LOAD segment from the file into the reserved range with appropriate permissions.
- FindPhdr -- Locates the program header table within the mapped image.
- FindGnuPropertySection -- Reads .note.gnu.property for BTI (Branch Target Identification) compatibility on AArch64.
- BTI protection -- If the library is BTI-compatible, applies PROT_BTI to executable segments.
Address space reservation with ASLR enhancement:
From bionic/linker/linker_phdr.cpp (lines 589-662):
// Reserve a virtual address range such that if its limits were extended
// to the next 2**align boundary, it would not overlap with any existing
// mappings.
static void* ReserveWithAlignmentPadding(size_t size, size_t mapping_align,
size_t start_align,
void** out_gap_start,
size_t* out_gap_size) {
// ...
#if defined(__LP64__)
size_t first_byte = reinterpret_cast<size_t>(
__builtin_align_up(mmap_ptr, mapping_align));
size_t last_byte = reinterpret_cast<size_t>(
__builtin_align_down(mmap_ptr + mmap_size, mapping_align) - 1);
if (first_byte / kGapAlignment != last_byte / kGapAlignment) {
// This library crosses a 2MB boundary and will fragment a new huge
// page. Insert random inaccessible huge pages before to improve
// ASLR.
gap_size = kGapAlignment * (is_first_stage_init() ? 1 :
arc4random_uniform(kMaxGapUnits - 1) + 1);
}
#endif
This code implements an ASLR enhancement: when a library's mapping crosses a 2MB (PMD-sized) boundary, the linker reserves a random number of inaccessible 2MB units before the library. This makes it harder for attackers to locate library code by probing for readable memory mappings. The unit count is arc4random_uniform(kMaxGapUnits - 1) + 1, which ranges from 1 to kMaxGapUnits - 1 and is drawn afresh for each library load; during first-stage init, where is_first_stage_init() is true, the gap is fixed at a single unit.
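The boundary test at the heart of that excerpt is simple integer arithmetic. The sketch below is a simplification under stated assumptions: it skips the alignment of the first and last byte to mapping_align, takes the unit count from the caller instead of arc4random_uniform, and uses the hypothetical helper names crosses_gap_boundary and gap_bytes.

```cpp
#include <cstddef>
#include <cstdint>

constexpr uintptr_t kGapAlignment = 2 * 1024 * 1024;  // 2MB, as in the text

// True if the non-empty range [start, start + size) touches more than one
// 2MB-aligned window, i.e. its first and last byte fall in different windows.
constexpr bool crosses_gap_boundary(uintptr_t start, size_t size) {
  return start / kGapAlignment != (start + size - 1) / kGapAlignment;
}

// Size in bytes of a gap made of `units` inaccessible 2MB units.
constexpr size_t gap_bytes(size_t units) { return units * kGapAlignment; }
```

A mapping of two 4KiB pages starting at 0x1ff000 straddles the 2MB line at 0x200000 and would get a gap, while a mapping that starts exactly at 0x200000 and stays inside that window would not.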
7.3.6 The Load Bias and Virtual Address Calculation
A central concept in ELF loading is the load bias:
From the documentation comment in bionic/linker/linker_phdr.cpp (lines 74-149):
An ELF file's program header table contains one or more PT_LOAD
segments, which corresponds to portions of the file that need to
be mapped into the process' address space.
Each loadable segment has the following important properties:
p_offset -> segment file offset
p_filesz -> segment file size
p_memsz -> segment memory size (always >= p_filesz)
p_vaddr -> segment's virtual address
p_flags -> segment flags (e.g. readable, writable, executable)
p_align -> segment's alignment
The load_bias must be added to any p_vaddr value read from the ELF
file to determine the corresponding memory address.
load_bias = phdr0_load_address - page_start(phdr0->p_vaddr)
The load bias is the difference between where the first segment was actually mapped and where it "wanted" to be (its page-aligned p_vaddr). Since all segments maintain their relative positions, adding the load bias to any p_vaddr gives the actual memory address.
The size of the address range that must be reserved for all PT_LOAD segments is computed by phdr_table_get_load_size:
From bionic/linker/linker_phdr.cpp (lines 516-553):
size_t phdr_table_get_load_size(const ElfW(Phdr)* phdr_table,
size_t phdr_count,
ElfW(Addr)* out_min_vaddr,
ElfW(Addr)* out_max_vaddr) {
ElfW(Addr) min_vaddr = UINTPTR_MAX;
ElfW(Addr) max_vaddr = 0;
for (size_t i = 0; i < phdr_count; ++i) {
const ElfW(Phdr)* phdr = &phdr_table[i];
if (phdr->p_type != PT_LOAD) {
continue;
}
if (phdr->p_vaddr < min_vaddr) {
min_vaddr = phdr->p_vaddr;
}
if (phdr->p_vaddr + phdr->p_memsz > max_vaddr) {
max_vaddr = phdr->p_vaddr + phdr->p_memsz;
}
}
min_vaddr = page_start(min_vaddr);
max_vaddr = page_end(max_vaddr);
return max_vaddr - min_vaddr;
}
7.3.7 16KiB Page Size Compatibility
Android is transitioning from 4KiB to 16KiB page sizes. The linker includes compatibility logic for loading 4KiB-aligned libraries on 16KiB-page devices:
From bionic/linker/linker_phdr.cpp (lines 190-206):
if (kPageSize == 16 * 1024 && min_align_ < kPageSize) {
auto compat_prop_val =
::android::base::GetProperty(
"bionic.linker.16kb.app_compat.enabled", "false");
should_use_16kib_app_compat_ =
ParseBool(compat_prop_val) == ParseBoolResult::kTrue ||
get_16kb_appcompat_mode();
}
In compatibility mode, the linker reads ELF segments into a writable
reservation rather than using mmap() directly, because mmap() requires
mappings aligned to the system page size (16KiB), but the library's segments
may be aligned to only 4KiB.
This is controlled by the system property
bionic.linker.16kb.app_compat.enabled and an ELF note
(NT_ANDROID_TYPE_PAD_SEGMENT) that indicates the library supports
segment padding for page size migration.
7.3.8 Relocation Processing
After all segments are mapped, the linker must process relocations -- patches to code and data that encode references to symbols whose addresses are not known until load time.
The relocation engine is in bionic/linker/linker_relocate.cpp.
From bionic/linker/linker_relocate.cpp (lines 63-95):
class Relocator {
public:
Relocator(const VersionTracker& version_tracker,
const SymbolLookupList& lookup_list)
: version_tracker(version_tracker), lookup_list(lookup_list)
{}
soinfo* si = nullptr;
const char* si_strtab = nullptr;
size_t si_strtab_size = 0;
ElfW(Sym)* si_symtab = nullptr;
const VersionTracker& version_tracker;
const SymbolLookupList& lookup_list;
// Cache key/value for repeated symbol lookups
ElfW(Word) cache_sym_val = 0;
const ElfW(Sym)* cache_sym = nullptr;
soinfo* cache_si = nullptr;
// ...
};
The Relocator class maintains state for processing a library's relocations.
The symbol cache (lines 78-81) is a critical optimization: many relocations in
a library reference the same symbol, and the cache avoids repeated hash table
lookups.
Relocation modes:
From bionic/linker/linker_relocate.cpp (lines 132-139):
enum class RelocMode {
// Fast path for JUMP_SLOT relocations.
JumpTable,
// Fast path for typical relocations: ABSOLUTE, GLOB_DAT, or RELATIVE.
Typical,
// Handle all relocation types, including text sections and statistics.
General,
};
The linker uses template specialization on RelocMode to generate three
versions of the relocation loop. The JumpTable and Typical modes are
optimized fast paths that handle the vast majority of relocations. The
General mode handles rare cases like TLS relocations, text relocations
(32-bit only), and IFUNCs.
Processing a single relocation:
From bionic/linker/linker_relocate.cpp (lines 163-176):
template <RelocMode Mode>
static bool process_relocation_impl(Relocator& relocator,
const rel_t& reloc) {
void* const rel_target = reinterpret_cast<void*>(
relocator.si->apply_memtag_if_mte_globals(
reloc.r_offset + relocator.si->load_bias));
const uint32_t r_type = ELFW(R_TYPE)(reloc.r_info);
const uint32_t r_sym = ELFW(R_SYM)(reloc.r_info);
soinfo* found_in = nullptr;
const ElfW(Sym)* sym = nullptr;
const char* sym_name = nullptr;
ElfW(Addr) sym_addr = 0;
if (r_sym != 0) {
sym_name = relocator.get_string(
relocator.si_symtab[r_sym].st_name);
}
For each relocation entry, the linker:
- Computes the target address (offset + load_bias)
- Extracts the relocation type and symbol index
- Looks up the symbol name in the string table
- Resolves the symbol to an address
- Applies the relocation (writes the resolved address to the target)
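The simplest case of those steps is a RELATIVE relocation, which needs no symbol lookup: the value written is just the load bias plus the addend. The sketch below models the mapped image as a byte vector whose index 0 corresponds to the address load_bias. That model and the function name apply_relative are assumptions made for illustration; Elf64_Rela is the standard elf.h type.

```cpp
#include <elf.h>
#include <cstdint>
#include <cstring>
#include <vector>

// Patches the 8 bytes at image[r_offset] with load_bias + r_addend.
// Returns false if the target would fall outside the toy image.
inline bool apply_relative(std::vector<unsigned char>& image, uint64_t load_bias,
                           const Elf64_Rela& rela) {
  if (rela.r_offset > image.size() ||
      image.size() - rela.r_offset < sizeof(uint64_t)) {
    return false;
  }
  const uint64_t value = load_bias + static_cast<uint64_t>(rela.r_addend);
  std::memcpy(image.data() + rela.r_offset, &value, sizeof(value));
  return true;
}
```

Symbol-based relocation types follow the same pattern, except that the written value comes from the resolved symbol's address rather than from the load bias alone.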
Symbol lookup with caching:
From bionic/linker/linker_relocate.cpp (lines 100-130):
static inline bool lookup_symbol(Relocator& relocator, uint32_t r_sym,
const char* sym_name,
soinfo** found_in,
const ElfW(Sym)** sym) {
if (r_sym == relocator.cache_sym_val) {
*found_in = relocator.cache_si;
*sym = relocator.cache_sym;
count_relocation_if<DoLogging>(kRelocSymbolCached);
} else {
const version_info* vi = nullptr;
if (!relocator.si->lookup_version_info(
relocator.version_tracker, r_sym, sym_name, &vi)) {
return false;
}
soinfo* local_found_in = nullptr;
const ElfW(Sym)* local_sym = soinfo_do_lookup(
sym_name, vi, &local_found_in, relocator.lookup_list);
relocator.cache_sym_val = r_sym;
relocator.cache_si = local_found_in;
relocator.cache_sym = local_sym;
*found_in = local_found_in;
*sym = local_sym;
}
if (*sym == nullptr) {
if (ELF_ST_BIND(relocator.si_symtab[r_sym].st_info) != STB_WEAK) {
DL_ERR("cannot locate symbol \"%s\" referenced by \"%s\"",
sym_name, relocator.si->get_realpath());
return false;
}
}
return true;
}
The lookup uses version information (ELF symbol versioning) when available, which allows libraries to export multiple versions of the same symbol. This is how libc can evolve its API without breaking backward compatibility.
Relocation statistics:
void print_linker_stats() {
LD_DEBUG(statistics,
"RELO STATS: %s: %d abs, %d rel, %d symbol (%d cached)",
g_argv[0],
linker_stats.count[kRelocAbsolute],
linker_stats.count[kRelocRelative],
linker_stats.count[kRelocSymbol],
linker_stats.count[kRelocSymbolCached]);
}
These statistics, enabled via LD_DEBUG=statistics, reveal the relocation
workload. A typical Android app might process tens of thousands of relocations
during startup. The symbol cache typically achieves hit rates above 80%,
significantly reducing startup time.
7.3.9 Symbol Resolution
Symbol resolution is the process of finding the definition of a symbol given its name. The linker supports two hash table formats:
- ELF hash (classic DT_HASH) -- The original ELF hash table
- GNU hash (DT_GNU_HASH) -- A more efficient format that uses a Bloom filter for fast rejection
From bionic/linker/linker_soinfo.h (lines 80-98):
struct SymbolLookupLib {
uint32_t gnu_maskwords_ = 0;
uint32_t gnu_shift2_ = 0;
ElfW(Addr)* gnu_bloom_filter_ = nullptr;
const char* strtab_;
size_t strtab_size_;
const ElfW(Sym)* symtab_;
const ElfW(Versym)* versym_;
const uint32_t* gnu_chain_;
size_t gnu_nbucket_;
uint32_t* gnu_bucket_;
soinfo* si_ = nullptr;
bool needs_sysv_lookup() const {
return si_ != nullptr && gnu_bloom_filter_ == nullptr;
}
};
The SymbolLookupLib structure pre-extracts all the fields needed for symbol
lookup from a library, avoiding repeated pointer chasing during the relocation
loop. The needs_sysv_lookup() method returns true only for libraries that
lack a GNU hash table (increasingly rare).
GNU hash Bloom filter:
The GNU hash table includes a Bloom filter that allows the linker to quickly reject lookups for symbols that definitely do not exist in a library. This is particularly effective because most symbols are defined in only one or two libraries, so the vast majority of lookups in other libraries will be rejected by the Bloom filter without examining the hash chains.
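To make the rejection step concrete, here is a toy version of the scheme. The gnu_hash function below is the standard GNU symbol hash (multiply by 33, seed 5381). ToyBloom is a hypothetical stand-in that assumes 64-bit filter words and a caller-chosen shift2, and it selects the word with a modulo rather than the mask the real table uses.

```cpp
#include <cstdint>
#include <vector>

// GNU symbol hash: h = h * 33 + c for each byte, starting from 5381.
inline uint32_t gnu_hash(const char* name) {
  uint32_t h = 5381;
  for (const unsigned char* p = reinterpret_cast<const unsigned char*>(name);
       *p != 0; ++p) {
    h = h * 33 + *p;
  }
  return h;
}

// Toy Bloom filter; shift2 plays the role of gnu_shift2_ in the text.
struct ToyBloom {
  std::vector<uint64_t> words;
  uint32_t shift2;

  // Each hash selects two bits within one 64-bit word.
  uint64_t mask_for(uint32_t h) const {
    return (uint64_t{1} << (h % 64)) | (uint64_t{1} << ((h >> shift2) % 64));
  }
  void add(uint32_t h) { words[(h / 64) % words.size()] |= mask_for(h); }
  // False means "definitely absent"; true means "look in the hash chains".
  bool may_contain(uint32_t h) const {
    return (words[(h / 64) % words.size()] & mask_for(h)) == mask_for(h);
  }
};
```

After adding only the hash of "printf" to a one-word filter with shift2 set to 6, a query for "exit" is rejected without touching any bucket or chain, because one of its two bits is not set.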
Symbol lookup order:
The SymbolLookupList class defines the order in which libraries are searched:
class SymbolLookupList {
std::vector<SymbolLookupLib> libs_;
SymbolLookupLib sole_lib_;
const SymbolLookupLib* begin_;
const SymbolLookupLib* end_;
size_t slow_path_count_ = 0;
// ...
};
For a library with DT_SYMBOLIC, its own symbol table is searched first.
Otherwise, the order follows the standard ELF rules: global scope first (all
libraries loaded with RTLD_GLOBAL), then the local scope (the library and its
dependencies).
7.3.10 Library Search and Loading
When the linker needs to load a library (either from DT_NEEDED or dlopen), it searches multiple locations in a defined order.
From bionic/linker/linker.cpp (lines 1051-1082):
static int open_library(android_namespace_t* ns,
ZipArchiveCache* zip_archive_cache,
const char* name, soinfo *needed_by,
off64_t* file_offset, std::string* realpath) {
// If the name contains a slash, open directly
if (strchr(name, '/') != nullptr) {
return open_library_at_path(zip_archive_cache, name,
file_offset, realpath);
}
// 1. LD_LIBRARY_PATH has the highest priority
int fd = open_library_on_paths(zip_archive_cache, name, file_offset,
ns->get_ld_library_paths(), realpath);
// 2. Try the DT_RUNPATH, and verify accessibility
if (fd == -1 && needed_by != nullptr) {
fd = open_library_on_paths(zip_archive_cache, name, file_offset,
needed_by->get_dt_runpath(), realpath);
if (fd != -1 && !ns->is_accessible(*realpath)) {
close(fd);
fd = -1;
}
}
// 3. Search the namespace's default paths
if (fd == -1) {
fd = open_library_on_paths(zip_archive_cache, name, file_offset,
ns->get_default_library_paths(), realpath);
}
return fd;
}
The search order is:
graph TD
A["Library name<br/>(e.g., libfoo.so)"] --> B{Contains '/'?}
B -->|Yes| C["Open directly at path"]
B -->|No| D["Search LD_LIBRARY_PATH"]
D -->|Found| Z["Return fd"]
D -->|Not found| E["Search DT_RUNPATH<br/>(from requesting library)"]
E -->|Found + accessible| Z
E -->|Not found| F["Search namespace<br/>default paths"]
F -->|Found| Z
F -->|Not found| G["Search linked<br/>namespaces"]
G -->|Found + shared| Z
G -->|Not found| H["DL_ERR: library not found"]
style Z fill:#c8e6c9
style H fill:#ffcdd2
Loading from APK files (ZIP):
A unique feature of Android's linker is the ability to load shared libraries
directly from APK files (which are ZIP archives). From bionic/linker/linker.cpp
(lines 927-996):
static int open_library_in_zipfile(ZipArchiveCache* zip_archive_cache,
const char* const input_path,
off64_t* file_offset,
std::string* realpath) {
// Treat an '!/' separator inside a path as the separator between
// the zip file name and the subdirectory to search within it.
const char* const separator = strstr(path, kZipFileSeparator);
// ...
ZipEntry entry;
if (FindEntry(handle, file_path, &entry) != 0) {
close(fd);
return -1;
}
// Check if it is properly stored (not compressed, page-aligned)
if (entry.method != kCompressStored ||
(entry.offset % page_size()) != 0) {
close(fd);
return -1;
}
*file_offset = entry.offset;
return fd;
}
The library must be stored uncompressed and page-aligned within the ZIP file.
The linker opens the APK, finds the entry, and returns a file descriptor with
the offset to the library data. The path syntax uses !/ as a separator:
/data/app/com.example/base.apk!/lib/arm64-v8a/libfoo.so.
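The path handling and the two storage requirements can be sketched in a few lines. Both helpers below are hypothetical and only mirror the checks visible in the excerpt: split at the first "!/" separator, then require an uncompressed entry at a page-aligned offset.

```cpp
#include <cstdint>
#include <string>

// Splits "archive!/entry" at the first "!/". Returns false when the
// separator is missing or either side would be empty.
inline bool split_zip_path(const std::string& path, std::string* archive,
                           std::string* entry) {
  static const std::string kSeparator = "!/";
  const std::size_t pos = path.find(kSeparator);
  if (pos == std::string::npos || pos == 0) return false;
  if (pos + kSeparator.size() >= path.size()) return false;
  *archive = path.substr(0, pos);
  *entry = path.substr(pos + kSeparator.size());
  return true;
}

// An entry can be mapped in place only if it is stored (not compressed)
// and its data starts on a page boundary.
inline bool is_mappable_entry(bool stored, uint64_t offset, uint64_t page_size) {
  return stored && offset % page_size == 0;
}
```

For the path shown above, the archive part is /data/app/com.example/base.apk and the entry part is lib/arm64-v8a/libfoo.so.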
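7.3.11 Dependency Walking and Load Order
The find_libraries function (in linker.cpp) performs a breadth-first walk
of the dependency tree. In BFS order a library is visited before its
dependencies, and direct dependencies are visited before transitive ones; this
load order also determines the order in which libraries are searched during
symbol resolution.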
From bionic/linker/linker.cpp (lines 703-741):
template<typename F>
static bool walk_dependencies_tree(soinfo* root_soinfo, F action) {
SoinfoLinkedList visit_list;
SoinfoLinkedList visited;
visit_list.push_back(root_soinfo);
soinfo* si;
while ((si = visit_list.pop_front()) != nullptr) {
if (visited.contains(si)) {
continue;
}
walk_action_result_t result = action(si);
if (result == kWalkStop) {
return false;
}
visited.push_back(si);
if (result != kWalkSkip) {
si->get_children().for_each([&](soinfo* child) {
visit_list.push_back(child);
});
}
}
return true;
}
This BFS walker is used for:
- Loading dependencies (find_libraries)
- dlsym(RTLD_DEFAULT) global symbol lookup
- dlsym(handle) handle-based symbol lookup
- Constructor invocation ordering
The three possible action results (kWalkStop, kWalkContinue, kWalkSkip)
allow the walker to be used for both search (stop when found) and traversal
(visit everything) operations.
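The same walk can be reproduced over a plain map of library names, which makes the visiting order easy to inspect. This is a sketch rather than the linker's code: Graph, WalkResult and walk_bfs are illustrative names, and strings stand in for soinfo pointers.

```cpp
#include <deque>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <vector>

enum WalkResult { kWalkStop, kWalkContinue, kWalkSkip };
using Graph = std::map<std::string, std::vector<std::string>>;

// Breadth-first walk from root. Returns false if the action stopped the walk.
inline bool walk_bfs(const Graph& deps, const std::string& root,
                     const std::function<WalkResult(const std::string&)>& action) {
  std::deque<std::string> queue{root};
  std::set<std::string> visited;
  while (!queue.empty()) {
    const std::string lib = queue.front();
    queue.pop_front();
    if (!visited.insert(lib).second) continue;  // already handled
    const WalkResult result = action(lib);
    if (result == kWalkStop) return false;
    if (result == kWalkSkip) continue;          // do not descend into children
    const auto it = deps.find(lib);
    if (it != deps.end()) {
      queue.insert(queue.end(), it->second.begin(), it->second.end());
    }
  }
  return true;
}
```

For an executable that needs libA.so and libB.so, both of which need libc.so, the walk visits the executable, then libA.so, then libB.so, then libc.so once. A search simply returns kWalkStop from the action when it finds what it wants.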
7.3.12 The dlopen/dlsym/dlclose API
Applications interact with the linker at runtime through the dl* family of
functions. These are exposed through dlfcn.cpp:
From bionic/linker/dlfcn.cpp (lines 49-99):
extern "C" {
android_namespace_t* __loader_android_create_namespace(
const char* name,
const char* ld_library_path,
const char* default_library_path,
uint64_t type,
const char* permitted_when_isolated_path,
android_namespace_t* parent_namespace,
const void* caller_addr) __LINKER_PUBLIC__;
void* __loader_android_dlopen_ext(
const char* filename,
int flags,
const android_dlextinfo* extinfo,
const void* caller_addr) __LINKER_PUBLIC__;
void* __loader_dlopen(
const char* filename,
int flags,
const void* caller_addr) __LINKER_PUBLIC__;
void* __loader_dlsym(
void* handle,
const char* symbol,
const void* caller_addr) __LINKER_PUBLIC__;
int __loader_dlclose(void* handle) __LINKER_PUBLIC__;
Every function shown except __loader_dlclose takes a caller_addr parameter,
which the linker uses to determine the namespace context. By examining which
soinfo contains the caller's address, the linker determines which namespace
the caller belongs to, and searches that namespace for the requested library.
Android-specific extensions:
android_dlopen_ext provides capabilities beyond standard dlopen:
- ANDROID_DLEXT_FORCE_LOAD -- Load even if already loaded
- ANDROID_DLEXT_USE_LIBRARY_FD -- Load from an explicit file descriptor
- ANDROID_DLEXT_RESERVED_ADDRESS -- Load at a specific address
- ANDROID_DLEXT_USE_NAMESPACE -- Load into a specific namespace
7.3.13 Protected Data and Security
The linker protects its internal data structures against corruption:
From bionic/linker/linker.cpp (lines 468-491):
ProtectedDataGuard::ProtectedDataGuard() {
if (ref_count_++ == 0) {
protect_data(PROT_READ | PROT_WRITE);
}
if (ref_count_ == 0) { // overflow
async_safe_fatal("Too many nested calls to dlopen()");
}
}
ProtectedDataGuard::~ProtectedDataGuard() {
if (--ref_count_ == 0) {
protect_data(PROT_READ);
}
}
void ProtectedDataGuard::protect_data(int protection) {
g_soinfo_allocator.protect_all(protection);
g_soinfo_links_allocator.protect_all(protection);
g_namespace_allocator.protect_all(protection);
g_namespace_list_allocator.protect_all(protection);
}
All four allocators (soinfo, soinfo links, namespaces, namespace links) are
protected with read-only memory mappings. A ProtectedDataGuard must be
acquired (via RAII) before modifying any linker data. This is a defense-in-depth
measure: if an attacker corrupts linker data structures, the linker will crash
with a SIGSEGV (access violation) rather than executing attacker-controlled
code.
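The pattern is easy to reproduce with plain POSIX calls. The sketch below protects a single anonymous page instead of the linker's four allocators and omits the overflow check; ProtectedPage and WriteGuard are hypothetical names, while mmap, mprotect, munmap and sysconf are the standard interfaces.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>

// One page that is read-only except while at least one WriteGuard is alive.
struct ProtectedPage {
  std::size_t size;
  void* addr;
  int writers = 0;  // number of live guards, like ref_count_ in the excerpt

  ProtectedPage()
      : size(static_cast<std::size_t>(sysconf(_SC_PAGESIZE))),
        addr(mmap(nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0)) {}
  ~ProtectedPage() {
    if (addr != MAP_FAILED) munmap(addr, size);
  }
  ProtectedPage(const ProtectedPage&) = delete;
  ProtectedPage& operator=(const ProtectedPage&) = delete;
};

// RAII guard: the first guard makes the page writable, the last restores it.
class WriteGuard {
 public:
  explicit WriteGuard(ProtectedPage& page) : page_(page) {
    if (page_.writers++ == 0) mprotect(page_.addr, page_.size, PROT_READ | PROT_WRITE);
  }
  ~WriteGuard() {
    if (--page_.writers == 0) mprotect(page_.addr, page_.size, PROT_READ);
  }
  WriteGuard(const WriteGuard&) = delete;
  WriteGuard& operator=(const WriteGuard&) = delete;

 private:
  ProtectedPage& page_;
};
```

Nested guards behave like nested dlopen calls: only the outermost one changes the page protection, so a write attempted outside any guard faults instead of silently corrupting the data.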
7.3.14 Linker Configuration
The linker reads its configuration from one of several locations:
From bionic/linker/linker.cpp (lines 98-103):
static const char* const kLdConfigArchFilePath =
"/system/etc/ld.config." ABI_STRING ".txt";
static const char* const kLdConfigFilePath =
"/system/etc/ld.config.txt";
static const char* const kLdConfigVndkLiteFilePath =
"/system/etc/ld.config.vndk_lite.txt";
static const char* const kLdGeneratedConfigFilePath =
"/linkerconfig/ld.config.txt";
The preferred source is the generated configuration at /linkerconfig/ld.config.txt,
produced by the linkerconfig tool (see Section 7.4). This file defines
namespaces, their search paths, permitted paths, and inter-namespace links.
The configuration file format uses INI-style sections:
[default]
namespace.default.search.paths = /system/${LIB}
namespace.default.permitted.paths = /system/${LIB}/hw
namespace.default.isolated = true
namespace.default.links = vndk,system
namespace.default.link.vndk.shared_libs = libcutils.so:libbase.so
namespace.default.link.system.shared_libs = libc.so:libm.so:libdl.so
The ConfigParser class in bionic/linker/linker_config.cpp parses this
format, supporting assignment (=), append (+=), and section ([name])
directives.
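The three directive kinds can be told apart with a few string operations. The following is a minimal sketch and not the ConfigParser implementation: ToyConfig, trim_ws and apply_line are hypothetical, and two behaviors are assumptions made here: lines beginning with '#' are treated as comments, and appended values are joined with ':'.

```cpp
#include <map>
#include <string>

struct ToyConfig {
  std::string section;                        // most recent [section] seen
  std::map<std::string, std::string> values;  // property name -> value
};

inline std::string trim_ws(const std::string& s) {
  const char* ws = " \t\r\n";
  const std::size_t b = s.find_first_not_of(ws);
  if (b == std::string::npos) return "";
  return s.substr(b, s.find_last_not_of(ws) - b + 1);
}

// Applies one line to the config. Returns false for lines it cannot parse.
inline bool apply_line(ToyConfig& cfg, const std::string& raw) {
  const std::string line = trim_ws(raw);
  if (line.empty() || line[0] == '#') return true;  // blank or comment (assumed)
  if (line.front() == '[' && line.back() == ']') {  // section directive
    cfg.section = trim_ws(line.substr(1, line.size() - 2));
    return true;
  }
  const std::size_t eq = line.find('=');
  if (eq == std::string::npos || eq == 0) return false;
  const bool append = line[eq - 1] == '+';          // "+=" versus "="
  const std::string name = trim_ws(line.substr(0, append ? eq - 1 : eq));
  const std::string value = trim_ws(line.substr(eq + 1));
  if (name.empty()) return false;
  if (append && !cfg.values[name].empty()) {
    cfg.values[name] += ":" + value;                // assumed join character
  } else {
    cfg.values[name] = value;
  }
  return true;
}
```

Feeding it the search.paths line from the example above and then a second line that uses += with /vendor/${LIB} leaves the property holding both paths joined by a colon.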
7.3.15 The Complete ELF Loading Pipeline
Here is the complete pipeline from dlopen("libfoo.so") to execution:
graph TD
A["dlopen('libfoo.so', RTLD_NOW)"] --> B["Determine caller namespace"]
B --> C["Search library paths"]
C --> D["Open file descriptor"]
D --> E["Check if already loaded<br/>(by inode or realpath)"]
E -->|Already loaded| F["Increment refcount, return handle"]
E -->|Not loaded| G["ElfReader::Read()"]
G --> G1["ReadElfHeader()"]
G1 --> G2["VerifyElfHeader()"]
G2 --> G3["ReadProgramHeaders()"]
G3 --> G4["ReadSectionHeaders()"]
G4 --> G5["ReadDynamicSection()"]
G5 --> G6["ReadPadSegmentNote()"]
G6 --> H["ElfReader::Load()"]
H --> H1["ReserveAddressSpace()"]
H1 --> H2["LoadSegments()"]
H2 --> H3["FindPhdr()"]
H3 --> H4["FindGnuPropertySection()"]
H4 --> I["Create soinfo"]
I --> J["prelink_image()<br/>(parse .dynamic)"]
J --> K["Load DT_NEEDED<br/>(recursive BFS)"]
K --> L["link_image()<br/>(process relocations)"]
L --> M["call_constructors()<br/>(.init_array)"]
M --> N["Return handle"]
style A fill:#e1f5fe
style N fill:#c8e6c9
7.4 VNDK and Linker Namespaces
7.4.1 The Treble Namespace Problem
Android's Treble architecture (introduced in Android 8.0) separates the
platform (framework) from the vendor implementation. The goal is to
allow the platform to be updated independently of vendor code. But native
libraries pose a challenge: if a vendor library and a platform library both
link against libutils.so, they might need different versions of it.
The solution is linker namespaces -- the linker's mechanism for isolating different sets of libraries so they cannot see each other's symbols.
7.4.2 The android_namespace_t Structure
From bionic/linker/linker_namespaces.h (lines 72-183):
struct android_namespace_t {
const char* get_name() const { return name_.c_str(); }
bool is_isolated() const { return is_isolated_; }
bool is_also_used_as_anonymous() const {
return is_also_used_as_anonymous_;
}
const std::vector<std::string>& get_ld_library_paths() const;
const std::vector<std::string>& get_default_library_paths() const;
const std::vector<std::string>& get_permitted_paths() const;
const std::vector<std::string>& get_allowed_libs() const;
const std::vector<android_namespace_link_t>& linked_namespaces() const;
void add_linked_namespace(android_namespace_t* linked_namespace,
std::unordered_set<std::string> shared_lib_sonames,
bool allow_all_shared_libs);
void add_soinfo(soinfo* si);
void remove_soinfo(soinfo* si);
const soinfo_list_t& soinfo_list() const;
bool is_accessible(const std::string& path);
bool is_accessible(soinfo* si);
private:
std::string name_;
bool is_isolated_;
bool is_exempt_list_enabled_;
bool is_also_used_as_anonymous_;
std::vector<std::string> ld_library_paths_;
std::vector<std::string> default_library_paths_;
std::vector<std::string> permitted_paths_;
std::vector<std::string> allowed_libs_;
std::vector<android_namespace_link_t> linked_namespaces_;
soinfo_list_t soinfo_list_;
};
Key concepts:
- Isolated namespace: When is_isolated_ is true, the namespace can only load libraries from its default_library_paths_ and permitted_paths_. This prevents vendor code from accidentally loading platform libraries.
- Namespace links: Libraries from one namespace can be made visible to another through links. Each link specifies which libraries are shared:
struct android_namespace_link_t {
android_namespace_t* linked_namespace_;
std::unordered_set<std::string> shared_lib_sonames_;
bool allow_all_shared_libs_;
bool is_accessible(const char* soname) const {
return allow_all_shared_libs_ ||
shared_lib_sonames_.find(soname) != shared_lib_sonames_.end();
}
};
- Allowed libs: An additional filter on which libraries can be loaded into the namespace, regardless of path.
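The link filter described above can be modelled in a few lines. This is a minimal sketch, not the linker's code: NamespaceLink and make_example_system_link are hypothetical names, and the real android_namespace_link_t holds a pointer to the linked namespace rather than its name.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical, simplified stand-in for android_namespace_link_t.
struct NamespaceLink {
  std::string target_namespace;                        // name of the linked namespace
  std::unordered_set<std::string> shared_lib_sonames;  // sonames exported over this link
  bool allow_all_shared_libs;                          // true: every soname is exported

  // Same decision as the is_accessible() method quoted above.
  bool is_accessible(const std::string& soname) const {
    return allow_all_shared_libs || shared_lib_sonames.count(soname) != 0;
  }
};

// Example link (assumed values): a namespace that imports only the core
// Bionic libraries from a namespace called "system".
inline NamespaceLink make_example_system_link() {
  return NamespaceLink{"system", {"libc.so", "libm.so", "libdl.so"}, false};
}
```

With this link, a request for libc.so is forwarded to the linked namespace, while a request for libandroid_runtime.so is not, unless allow_all_shared_libs is set.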
7.4.3 Namespace Architecture¶
The standard Android namespace topology looks like this:
graph TD
subgraph "System Section"
SYS["default<br/>(system namespace)"]
VNDK["vndk<br/>(VNDK libraries)"]
VNDK_PROD["vndk_product<br/>(Product VNDK)"]
SPHAL["sphal<br/>(Same-Process HAL)"]
RS["rs<br/>(RenderScript)"]
end
subgraph "Vendor Section"
VDEF["default<br/>(vendor namespace)"]
VVNDK["vndk<br/>(vendor VNDK)"]
end
subgraph "APEX Namespaces"
APEX["com.android.art<br/>(ART Runtime)"]
APEX2["com.android.vndk.vXX<br/>(VNDK APEX)"]
end
SYS -->|"libc.so, libm.so, libdl.so"| VNDK
SYS -->|"libc.so, libm.so, libdl.so"| VNDK_PROD
SYS -->|"libc.so, libm.so, libdl.so"| SPHAL
SYS -->|"libc.so, libm.so, libdl.so"| RS
VDEF -->|"LLNDK libraries"| SYS
VDEF -->|"VNDK-SP, VNDK-core"| VVNDK
VVNDK -->|"all shared libs"| VDEF
SPHAL -->|"LLNDK"| SYS
style SYS fill:#e1f5fe
style VDEF fill:#fff3e0
style VNDK fill:#f3e5f5
style APEX fill:#e8f5e9
7.4.4 VNDK Library Categories¶
The VNDK (Vendor NDK) defines four categories of libraries:
From build/soong/cc/vndk.go (lines 23-29):
const (
llndkLibrariesTxt = "llndk.libraries.txt"
vndkCoreLibrariesTxt = "vndkcore.libraries.txt"
vndkSpLibrariesTxt = "vndksp.libraries.txt"
vndkPrivateLibrariesTxt = "vndkprivate.libraries.txt"
vndkProductLibrariesTxt = "vndkproduct.libraries.txt"
)
| Category | Description | Example Libraries |
|---|---|---|
| LL-NDK | Low-Level NDK; always available to vendor | libc.so, libm.so, libdl.so, liblog.so |
| VNDK-core | Core VNDK; available to vendor but versioned | libcamera_metadata.so, libxml2.so |
| VNDK-SP | Same-Process VNDK; may also be loaded into framework processes by same-process HALs | libhardware.so, libhidlbase.so, libcutils.so, libbase.so, libutils.so |
| VNDK-private | Available only to other VNDK modules, not to vendor directly | Internal VNDK implementation libraries |
The VndkProperties structure in the build system defines how a library
declares its VNDK membership:
From build/soong/cc/vndk.go (lines 45-76):
type VndkProperties struct {
Vndk struct {
// declared as a VNDK or VNDK-SP module
Enabled *bool
// declared as a VNDK-SP module, which is a subset of VNDK
Support_system_process *bool
// declared as a VNDK-private module
Private *bool
// Extending another module
Extends *string
}
}
7.4.5 The linkerconfig Tool¶
The system/linkerconfig/ tool generates the linker configuration at boot
time. It is invoked by init during the early boot sequence and produces
/linkerconfig/ld.config.txt.
From system/linkerconfig/main.cc (lines 33-43):
#include "linkerconfig/apex.h"
#include "linkerconfig/apexconfig.h"
#include "linkerconfig/baseconfig.h"
#include "linkerconfig/configparser.h"
#include "linkerconfig/context.h"
#include "linkerconfig/environment.h"
#include "linkerconfig/namespacebuilder.h"
#include "linkerconfig/recovery.h"
#include "linkerconfig/variableloader.h"
#include "linkerconfig/variables.h"
The tool uses a modular builder pattern. Each namespace has a dedicated builder
in system/linkerconfig/contents/namespace/:
| Builder File | Namespace | Purpose |
|---|---|---|
| systemdefault.cc | default (system) | Framework code |
| vendordefault.cc | default (vendor) | Vendor binaries |
| vndk.cc | vndk / vndk_product | VNDK libraries |
| sphal.cc | sphal | Same-process HALs |
| rs.cc | rs | RenderScript |
| apexdefault.cc | APEX-specific | Per-APEX namespaces |
| productdefault.cc | default (product) | Product partition |
| recoverydefault.cc | default (recovery) | Recovery mode |
| isolateddefault.cc | default (isolated) | Isolated processes |
7.4.6 Bionic Library Links¶
Every namespace needs access to the core Bionic libraries. This is configured
by the AddStandardSystemLinks function:
From system/linkerconfig/contents/common/system_links.cc (lines 29-62):
const std::vector<std::string> kBionicLibs = {
"libc.so",
"libdl.so",
"libdl_android.so",
"libm.so",
};
void AddStandardSystemLinks(const Context& ctx, Section* section) {
const std::string system_ns_name = ctx.GetSystemNamespaceName();
section->ForEachNamespaces([&](Namespace& ns) {
if (ns.GetName() != system_ns_name) {
ns.GetLink(system_ns_name).AddSharedLib(kBionicLibs);
}
});
}
This ensures that every namespace can resolve Bionic's core libraries through a link to the system namespace. Without this, basic C library functions would be unavailable.
7.4.7 System Namespace Configuration¶
The system (default) namespace for framework code is configured in
system/linkerconfig/contents/namespace/systemdefault.cc.
From system/linkerconfig/contents/namespace/systemdefault.cc (lines 31-78):
void SetupSystemPermittedPaths(Namespace* ns) {
const std::vector<std::string> permitted_paths = {
"/system/${LIB}/drm",
"/system/${LIB}/extractors",
"/system/${LIB}/hw",
system_ext + "/${LIB}",
// Where odex files are located (libart needs to dlopen them)
"/system/framework",
"/system/app",
"/system/priv-app",
system_ext + "/framework",
system_ext + "/app",
system_ext + "/priv-app",
"/vendor/framework",
"/vendor/app",
"/vendor/priv-app",
"/odm/framework",
"/odm/app",
"/odm/priv-app",
product + "/framework",
product + "/app",
product + "/priv-app",
"/data",
"/mnt/expand",
"/apex/com.android.runtime/${LIB}/bionic",
"/system/${LIB}/bootstrap",
};
Note the explicit comment about VNDK isolation:
// We can't have entire /system/${LIB} as permitted paths because
// doing so makes it possible to load libs in /system/${LIB}/vndk*
// directories by their absolute paths. VNDK libs are built with
// previous versions of Android and thus must not be loaded into
// this namespace.
This is the security boundary in action: even though the system namespace has broad permissions, it deliberately excludes VNDK directories to prevent version mixing.
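To see why the permitted paths list individual subdirectories instead of all of /system/${LIB}, consider a simplified prefix check. The sketch below is only a model of the idea: is_permitted is a hypothetical helper, file_is_under_dir is modelled loosely on the linker's path helpers, and the vndk-sp-29 directory name is an illustrative example.

```cpp
#include <string>
#include <vector>

// True if `file` lies anywhere below directory `dir`, at any depth.
inline bool file_is_under_dir(const std::string& file, const std::string& dir) {
  return file.size() > dir.size() + 1 &&
         file.compare(0, dir.size(), dir) == 0 &&
         file[dir.size()] == '/';
}

// Hypothetical helper: does any permitted directory contain `realpath`?
inline bool is_permitted(const std::string& realpath,
                         const std::vector<std::string>& permitted_paths) {
  for (const std::string& dir : permitted_paths) {
    if (file_is_under_dir(realpath, dir)) return true;
  }
  return false;
}
```

With the narrow list {"/system/lib64/drm", "/system/lib64/hw"}, a path such as /system/lib64/vndk-sp-29/libcutils.so is rejected. If the whole of /system/lib64 were permitted, the same absolute path would pass, which is exactly the situation the source comment warns about.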
7.4.8 Vendor Namespace Configuration¶
Vendor processes run in their own namespace with strict isolation:
From system/linkerconfig/contents/namespace/vendordefault.cc (lines 35-68):
Namespace BuildVendorNamespace(const Context& ctx,
const std::string& name) {
Namespace ns(name, /*is_isolated=*/true, /*is_visible=*/true);
ns.AddSearchPath("/odm/${LIB}");
ns.AddSearchPath("/vendor/${LIB}");
ns.AddSearchPath("/vendor/${LIB}/hw");
ns.AddSearchPath("/vendor/${LIB}/egl");
ns.AddPermittedPath("/odm");
ns.AddPermittedPath("/vendor");
ns.AddPermittedPath("/system/vendor");
// Links to other namespaces
ns.GetLink("rs").AddSharedLib("libRS_internal.so");
ns.AddRequires(base::Split(
Var("LLNDK_LIBRARIES_VENDOR", ""), ":"));
if (IsVendorVndkVersionDefined()) {
ns.GetLink(ctx.GetSystemNamespaceName())
.AddSharedLib(Var("SANITIZER_DEFAULT_VENDOR"));
ns.GetLink("vndk").AddSharedLib({
Var("VNDK_SAMEPROCESS_LIBRARIES_VENDOR"),
Var("VNDK_CORE_LIBRARIES_VENDOR")});
}
return ns;
}
The vendor namespace:
- Is isolated (is_isolated=true) -- can only load from listed paths
- Can search /odm/${LIB} and /vendor/${LIB} (plus hw/egl subdirectories)
- Has links to:
  - The system namespace for LL-NDK libraries (libc, libm, libdl, liblog)
  - The VNDK namespace for versioned VNDK libraries
  - The RenderScript namespace for libRS_internal.so
7.4.9 VNDK Namespace Configuration¶
The VNDK namespace is where versioned VNDK libraries live:
From system/linkerconfig/contents/namespace/vndk.cc (lines 30-123):
Namespace BuildVndkNamespace(const Context& ctx,
VndkUserPartition vndk_user) {
const char* name;
if (is_system_or_unrestricted_section &&
vndk_user == VndkUserPartition::Product) {
name = "vndk_product";
} else {
name = "vndk";
}
Namespace ns(name, /*is_isolated=*/true,
/*is_visible=*/is_system_or_unrestricted_section);
// Search order:
// 1. VNDK Extensions (vendor/lib/vndk-sp, vendor/lib/vndk)
// 2. VNDK APEX (/apex/com.android.vndk.vXX/${LIB})
// 3. vendor/lib or product/lib for extensions
for (const auto& lib_path : lib_paths) {
ns.AddSearchPath(lib_path + "/vndk-sp");
if (!is_system_or_unrestricted_section) {
ns.AddSearchPath(lib_path + "/vndk");
}
}
ns.AddSearchPath("/apex/com.android.vndk.v" + vndk_version + "/${LIB}");
The VNDK namespace search order reveals the extension mechanism:
- VNDK Extensions (/vendor/${LIB}/vndk-sp) -- Vendor-provided replacements or extensions of VNDK libraries
- VNDK APEX (/apex/com.android.vndk.vXX/${LIB}) -- The canonical VNDK libraries, shipped as an APEX module
- Fallback -- Vendor's own library directory for libraries that VNDK extensions depend on
The vndk_product variant is a parallel namespace for product-partition apps,
which may use a different VNDK version than vendor code.
7.4.10 The Exempt List: Backward Compatibility¶
The linker includes an exempt list for backward compatibility:
From bionic/linker/linker.cpp (lines 226-268):
static bool is_exempt_lib(android_namespace_t* ns, const char* name,
const soinfo* needed_by) {
static const char* const kLibraryExemptList[] = {
"libandroid_runtime.so",
"libbinder.so",
"libcrypto.so",
"libcutils.so",
"libexpat.so",
"libgui.so",
"libmedia.so",
"libnativehelper.so",
"libssl.so",
"libstagefright.so",
"libsqlite.so",
"libui.so",
"libutils.so",
nullptr
};
// If you're targeting N, you don't get the exempt-list.
if (get_application_target_sdk_version() >= 24) {
return false;
}
// ...
}
Apps targeting API level 23 (Marshmallow) or lower are allowed to access these platform libraries directly, even though they are not part of the NDK. This was necessary because many apps written before namespace isolation arrived in Android 7.0 depended on these private libraries. Apps targeting API level 24 (Nougat) or higher are subject to strict namespace isolation.
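The target-SDK gate can be sketched as a standalone function. This is a simplified illustration: is_exempt_for_target_sdk and kExampleExemptList are hypothetical names, the list is abbreviated, and the real is_exempt_lib() also takes the namespace and the requesting library into account.

```cpp
#include <cstddef>
#include <cstring>

// Abbreviated copy of the exempt list; the full list is in the excerpt above.
static const char* const kExampleExemptList[] = {
  "libandroid_runtime.so", "libbinder.so", "libcutils.so", "libutils.so", nullptr
};

// Returns true only for listed libraries requested by apps targeting API 23 or lower.
bool is_exempt_for_target_sdk(const char* soname, int target_sdk_version) {
  if (target_sdk_version >= 24) return false;  // targeting N or later: no exemptions
  for (std::size_t i = 0; kExampleExemptList[i] != nullptr; ++i) {
    if (std::strcmp(soname, kExampleExemptList[i]) == 0) return true;
  }
  return false;
}
```

Under these assumptions, libcutils.so is exempt for an app targeting API 23 but not for the same app rebuilt against API 24.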
7.4.11 How Namespaces Interact with dlopen¶
When an application calls dlopen("libfoo.so", RTLD_NOW), the following
namespace-aware logic executes:
- The linker determines the caller's namespace from the return address
- It searches the caller's namespace paths
- If not found, it checks linked namespaces, but only for libraries in the link's shared_lib_sonames set
- If the library is in an isolated namespace, the linker verifies it is on an accessible path
The accessibility check:
From bionic/linker/linker.cpp (lines 1221-1249):
if ((fs_stat.f_type != TMPFS_MAGIC) && (!ns->is_accessible(realpath))) {
const soinfo* needed_by = task->is_dt_needed() ?
task->get_needed_by() : nullptr;
if (is_exempt_lib(ns, name, needed_by)) {
// Allow with warning for legacy apps
} else {
DL_OPEN_ERR("library \"%s\" needed or dlopened by \"%s\" is not "
"accessible for the namespace \"%s\"",
name, needed_or_dlopened_by, ns->get_name());
}
}
Note the TMPFS_MAGIC exception: libraries loaded from tmpfs (created via
memfd_create()) bypass the accessibility check. This enables apps to create
libraries at runtime (e.g., JIT compilation) without needing a writable
directory on the library search path.
7.4.12 Runtime Namespace Creation¶
Applications and the framework can create new namespaces at runtime through
the android_create_namespace API:
From bionic/linker/dlfcn.cpp (lines 51-57):
android_namespace_t* __loader_android_create_namespace(
const char* name,
const char* ld_library_path,
const char* default_library_path,
uint64_t type,
const char* permitted_when_isolated_path,
android_namespace_t* parent_namespace,
const void* caller_addr) __LINKER_PUBLIC__;
This is used by libnativeloader, which creates per-app namespaces with
appropriate isolation. Each app gets its own namespace that can see:
- The app's own native libraries (from the APK)
- LL-NDK libraries (via link to system namespace)
- VNDK-SP libraries (only for unbundled apps installed on the vendor or product partition)
- Libraries listed in the app's uses-native-library manifest entries
7.4.13 Default Library Paths¶
The linker defines default library search paths based on the device's configuration:
From bionic/linker/linker.cpp (lines 105-154):
#if defined(__LP64__)
static const char* const kSystemLibDir = "/system/lib64";
static const char* const kOdmLibDir = "/odm/lib64";
static const char* const kVendorLibDir = "/vendor/lib64";
static const char* const kAsanSystemLibDir = "/data/asan/system/lib64";
static const char* const kAsanOdmLibDir = "/data/asan/odm/lib64";
static const char* const kAsanVendorLibDir = "/data/asan/vendor/lib64";
#else
static const char* const kSystemLibDir = "/system/lib";
// ...
#endif
static const char* const kDefaultLdPaths[] = {
kSystemLibDir,
kOdmLibDir,
kVendorLibDir,
nullptr
};
static const char* const kAsanDefaultLdPaths[] = {
kAsanSystemLibDir,
kSystemLibDir,
kAsanOdmLibDir,
kOdmLibDir,
kAsanVendorLibDir,
kVendorLibDir,
nullptr
};
#if defined(__aarch64__)
static const char* const kHwasanSystemLibDir = "/system/lib64/hwasan";
static const char* const kHwasanOdmLibDir = "/odm/lib64/hwasan";
static const char* const kHwasanVendorLibDir = "/vendor/lib64/hwasan";
#endif
There are three sets of paths:
- Default -- Normal operation: /system/lib64, /odm/lib64, /vendor/lib64
- ASan -- AddressSanitizer mode: ASan-instrumented libraries in /data/asan/ are searched first, falling back to the normal paths
- HWASan -- Hardware AddressSanitizer mode (AArch64 only): HWASan-instrumented libraries in hwasan/ subdirectories are searched first
This allows sanitized builds to coexist with production builds on the same device, with the sanitized versions taking priority when the sanitizer is enabled.
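The resulting search order can be illustrated with a small sketch for the 64-bit case. SanitizerMode and default_search_paths are hypothetical names. The ASan ordering follows kAsanDefaultLdPaths as quoted above; the HWASan ordering is assumed to interleave in the same way, since the excerpt shows only the directory constants.

```cpp
#include <string>
#include <vector>

enum class SanitizerMode { kNone, kAsan, kHwasan };

// Builds the search order: each sanitized directory precedes its normal counterpart.
inline std::vector<std::string> default_search_paths(SanitizerMode mode) {
  const std::vector<std::string> normal = {"/system/lib64", "/odm/lib64", "/vendor/lib64"};
  std::vector<std::string> result;
  for (const std::string& dir : normal) {
    if (mode == SanitizerMode::kAsan) {
      result.push_back("/data/asan" + dir);   // e.g. /data/asan/system/lib64
    } else if (mode == SanitizerMode::kHwasan) {
      result.push_back(dir + "/hwasan");      // e.g. /system/lib64/hwasan
    }
    result.push_back(dir);                    // normal directory as fallback
  }
  return result;
}
```

Because the sanitized directory always comes first, an instrumented copy of a library shadows the production copy only when one actually exists.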
7.4.14 Namespace Isolation in Practice¶
Here is a concrete example of how namespace isolation works for a vendor process on a Treble-compliant device:
graph TD
subgraph "Vendor Process (/vendor/bin/camera_server)"
VP["camera_server<br/>Namespace: vendor/default"]
end
subgraph "vendor/default namespace"
VL1["libcamera_hal.so<br/>/vendor/lib64/hw/"]
VL2["libqcom_camera.so<br/>/vendor/lib64/"]
end
subgraph "vndk namespace"
VNDK1["libcutils.so<br/>/apex/com.android.vndk.v34/lib64/"]
VNDK2["libutils.so<br/>/apex/com.android.vndk.v34/lib64/"]
end
subgraph "system namespace"
SYS1["libc.so<br/>/system/lib64/"]
SYS2["libm.so<br/>/system/lib64/"]
SYS3["liblog.so<br/>/system/lib64/"]
end
VP --> VL1
VP --> VL2
VL1 -->|"DT_NEEDED"| VNDK1
VL1 -->|"DT_NEEDED"| VNDK2
VNDK1 -->|"LL-NDK link"| SYS1
VNDK1 -->|"LL-NDK link"| SYS2
VL2 -->|"LL-NDK link"| SYS3
VP -.->|"BLOCKED"| SYS_PRIV["libandroid_runtime.so<br/>/system/lib64/"]
style VP fill:#fff3e0
style VL1 fill:#fff3e0
style VL2 fill:#fff3e0
style VNDK1 fill:#f3e5f5
style VNDK2 fill:#f3e5f5
style SYS1 fill:#e1f5fe
style SYS2 fill:#e1f5fe
style SYS3 fill:#e1f5fe
style SYS_PRIV fill:#ffcdd2
In this scenario:
- camera_server lives in the vendor/default namespace
- It can load its own vendor libraries (libcamera_hal.so, libqcom_camera.so)
- Those libraries can use VNDK libraries (libcutils.so, libutils.so) through the vndk namespace link
- Everyone can use LL-NDK libraries (libc.so, libm.so, liblog.so) through links to the system namespace
- Direct access to platform-private libraries (libandroid_runtime.so) is blocked by namespace isolation
7.4.15 VNDK Deprecation and Evolution¶
The VNDK system is evolving. Recent AOSP versions include a --deprecate_vndk
flag in linkerconfig:
From system/linkerconfig/main.cc (lines 62-63):
The trend is toward using APEX modules for library versioning rather than the VNDK mechanism. Each APEX can carry its own versions of libraries, mounted under its own /apex directory and loaded through its own linker namespace. This gives each module a self-contained set of dependencies and better supports independent updates.
However, VNDK remains essential for backward compatibility with existing vendor implementations and will likely coexist with APEX-based solutions for multiple Android generations.
7.4.16 Putting It All Together: The Library Loading Decision Tree¶
When the linker encounters a DT_NEEDED entry or dlopen call, the complete
decision process is:
graph TD
A["Need library: libfoo.so"] --> B{Name contains '/'?}
B -->|Yes| C["Open directly at path"]
B -->|No| D["Search LD_LIBRARY_PATH"]
D --> E{Found?}
E -->|Yes| F["Check namespace accessibility"]
E -->|No| G["Search DT_RUNPATH"]
G --> H{Found?}
H -->|Yes| F
H -->|No| I["Search namespace default paths"]
I --> J{Found?}
J -->|Yes| K["No accessibility check needed<br/>(default paths are always accessible)"]
J -->|No| L["Search linked namespaces"]
L --> M{Found in linked ns?}
M -->|Yes| N{In shared_lib_sonames?}
N -->|Yes| O["Use library from linked namespace"]
N -->|No| P["Library not accessible"]
M -->|No| Q["Library not found"]
F --> R{Namespace isolated?}
R -->|No| S["Load library"]
R -->|Yes| T{Path in permitted_paths?}
T -->|Yes| S
T -->|No| U{Legacy exempt?}
U -->|Yes, SDK < 24| V["Load with warning"]
U -->|No| P
C --> F
K --> S
style S fill:#c8e6c9
style O fill:#c8e6c9
style V fill:#fff9c4
style P fill:#ffcdd2
style Q fill:#ffcdd2
7.4.17 Segment Loading In Detail¶
The LoadSegments() method in the ElfReader class iterates over every PT_LOAD
program header and maps the corresponding file region into the reserved address
space.
From bionic/linker/linker_phdr.cpp (lines 987-1086):
bool ElfReader::LoadSegments() {
size_t seg_align = should_use_16kib_app_compat_ ?
kCompatPageSize : kPageSize;
if (kPageSize >= 16384 && min_align_ < kPageSize &&
!should_use_16kib_app_compat_) {
DL_ERR_AND_LOG(
"\"%s\" program alignment (%zu) cannot be smaller than "
"system page size (%zu)", name_.c_str(), min_align_, kPageSize);
return false;
}
for (size_t i = 0; i < phdr_num_; ++i) {
const ElfW(Phdr)* phdr = &phdr_table_[i];
if (phdr->p_type != PT_LOAD) continue;
ElfW(Addr) p_memsz = phdr->p_memsz;
ElfW(Addr) p_filesz = phdr->p_filesz;
_extend_load_segment_vma(phdr_table_, phdr_num_, i, &p_memsz,
&p_filesz, should_pad_segments_,
should_use_16kib_app_compat_);
// Segment addresses in memory
ElfW(Addr) seg_start = phdr->p_vaddr + load_bias_;
ElfW(Addr) seg_end = seg_start + p_memsz;
ElfW(Addr) seg_page_end = __builtin_align_up(seg_end, seg_align);
ElfW(Addr) seg_file_end = seg_start + p_filesz;
if (file_length != 0) {
int prot = PFLAGS_TO_PROT(phdr->p_flags);
if ((prot & (PROT_EXEC | PROT_WRITE)) == (PROT_EXEC | PROT_WRITE)) {
if (DL_ERROR_AFTER(26, "\"%s\" has load segments that are both "
"writable and executable", name_.c_str())) {
return false;
}
}
if (should_use_16kib_app_compat_) {
if (!CompatMapSegment(i, file_length)) return false;
} else {
if (!MapSegment(i, file_length)) return false;
}
}
ZeroFillSegment(phdr);
DropPaddingPages(phdr, seg_file_end);
if (!MapBssSection(phdr, seg_page_end, seg_file_end)) return false;
}
return true;
}
Each PT_LOAD segment goes through four sub-operations:
- MapSegment / CompatMapSegment -- Maps the file content into the address space using mmap64() with MAP_FIXED. For 16KiB compatibility mode, the compat path reads data into an existing anonymous mapping instead of using mmap directly.
- ZeroFillSegment -- If the segment is writable and its file-backed data ends partway through a page, the remainder of that page must be zeroed. This is required by the ELF specification for BSS-like data.
- DropPaddingPages -- When segment extension is active (for page size migration), padding pages between segments are released using MADV_DONTNEED to reduce memory pressure.
- MapBssSection -- If p_memsz > p_filesz, the excess represents BSS data. The linker maps additional anonymous pages at the end of the segment and names them .bss using prctl(PR_SET_VMA).
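The address arithmetic at the top of the loop is easier to follow with concrete numbers. The sketch below reproduces the four computed addresses for one segment. SegmentBounds, align_up and compute_bounds are illustrative names, and the input values are invented for the example.

```cpp
#include <cstdint>

struct SegmentBounds {
  std::uint64_t seg_start;     // first byte of the segment in memory
  std::uint64_t seg_end;       // one past the last byte in memory (uses p_memsz)
  std::uint64_t seg_page_end;  // seg_end rounded up to a page boundary
  std::uint64_t seg_file_end;  // one past the last file-backed byte (uses p_filesz)
};

// Rounds `value` up to a multiple of `page_size`, which must be a power of two.
inline std::uint64_t align_up(std::uint64_t value, std::uint64_t page_size) {
  return (value + page_size - 1) & ~(page_size - 1);
}

inline SegmentBounds compute_bounds(std::uint64_t load_bias, std::uint64_t p_vaddr,
                                    std::uint64_t p_memsz, std::uint64_t p_filesz,
                                    std::uint64_t page_size) {
  SegmentBounds b;
  b.seg_start = p_vaddr + load_bias;
  b.seg_end = b.seg_start + p_memsz;
  b.seg_page_end = align_up(b.seg_end, page_size);
  b.seg_file_end = b.seg_start + p_filesz;
  return b;
}
```

For a load bias of 0x7000000000, p_vaddr 0x3000, p_memsz 0x2800, p_filesz 0x1200 and 4KiB pages, file data ends at 0x7000004200. The rest of that page, up to 0x7000005000, is the part that needs zero-filling, and the range from 0x7000005000 to the page-aligned segment end at 0x7000006000 is what would be backed by anonymous BSS pages.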
MapSegment in detail:
From bionic/linker/linker_phdr.cpp (lines 868-893):
bool ElfReader::MapSegment(size_t seg_idx, size_t len) {
const ElfW(Phdr)* phdr = &phdr_table_[seg_idx];
void* start = reinterpret_cast<void*>(
page_start(phdr->p_vaddr + load_bias_));
const ElfW(Addr) offset = file_offset_ +
page_start(phdr->p_offset);
int prot = PFLAGS_TO_PROT(phdr->p_flags);
void* seg_addr = mmap64(start, len, prot,
MAP_FIXED | MAP_PRIVATE, fd_, offset);
if (seg_addr == MAP_FAILED) {
DL_ERR("couldn't map \"%s\" segment %zd: %m",
name_.c_str(), seg_idx);
return false;
}
// Mark segments as huge page eligible
if ((phdr->p_flags & PF_X) && phdr->p_align == kPmdSize &&
get_transparent_hugepages_supported()) {
madvise(seg_addr, len, MADV_HUGEPAGE);
}
return true;
}
Note the transparent huge page support: executable segments aligned to PMD
size (2MB) receive MADV_HUGEPAGE, which tells the kernel to use huge pages
for these mappings. This reduces TLB misses for large code sections.
W+E segment rejection:
The linker rejects libraries with segments that are simultaneously writable and
executable (W+E) for apps targeting API level 26 or higher. This is a security
measure: W+E segments would allow an attacker who can write to memory to also
execute that memory, defeating W^X protections.
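The flag test itself is a two-bit mask comparison. The following sketch works on raw ELF p_flags values instead of the PROT_* bits used in LoadSegments(). The constant names and should_reject_segment are hypothetical, and the enforcement threshold is passed in as a parameter because the exact cut-off semantics of DL_ERROR_AFTER are not shown in the excerpt.

```cpp
#include <cstdint>

// ELF segment permission bits, with the values defined by the ELF specification.
constexpr std::uint32_t kPfX = 0x1;  // PF_X: executable
constexpr std::uint32_t kPfW = 0x2;  // PF_W: writable
constexpr std::uint32_t kPfR = 0x4;  // PF_R: readable

inline bool is_writable_and_executable(std::uint32_t p_flags) {
  return (p_flags & (kPfW | kPfX)) == (kPfW | kPfX);
}

// Hypothetical policy helper: reject W+E segments once the app's target SDK
// reaches `enforce_from_api` (26 according to the text above).
inline bool should_reject_segment(std::uint32_t p_flags, int target_sdk_version,
                                  int enforce_from_api) {
  return is_writable_and_executable(p_flags) && target_sdk_version >= enforce_from_api;
}
```

A typical code segment (R+X) and a typical data segment (R+W) both pass; only a segment carrying both the W and X bits is flagged.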
Segment extension for page size migration:
The _extend_load_segment_vma function extends the file-backed portion of a
segment to fill the gap between adjacent PT_LOAD segments. This matters when
an ELF's segment alignment (p_align) is larger than the system page size, for
example a 16KiB-aligned library running on a 4KiB-page kernel: the padding
between segments would otherwise be mapped as separate VMAs (Virtual Memory
Areas), consuming kernel slab memory. By extending segments to be contiguous,
the kernel can merge them into a single VMA:
From bionic/linker/linker_phdr.cpp (lines 817-866):
static inline void _extend_load_segment_vma(
const ElfW(Phdr)* phdr_table, size_t phdr_count,
size_t phdr_idx, ElfW(Addr)* p_memsz,
ElfW(Addr)* p_filesz, bool should_pad_segments,
bool should_use_16kib_app_compat) {
if (should_use_16kib_app_compat) return;
const ElfW(Phdr)* phdr = &phdr_table[phdr_idx];
// Don't do extension for p_align > 64KiB
if (phdr->p_align <= kPageSize || phdr->p_align > 64*1024 ||
!should_pad_segments) {
return;
}
// Find next PT_LOAD segment
const ElfW(Phdr)* next = nullptr;
if (phdr_idx + 1 < phdr_count &&
phdr_table[phdr_idx + 1].p_type == PT_LOAD) {
next = &phdr_table[phdr_idx + 1];
}
if (!next || *p_memsz != *p_filesz) return;
ElfW(Addr) next_start = page_start(next->p_vaddr);
ElfW(Addr) curr_end = page_end(phdr->p_vaddr + *p_memsz);
if (curr_end >= next_start) return;
// Extend to be contiguous
ElfW(Addr) extend = next_start - curr_end;
*p_memsz += extend;
*p_filesz += extend;
}
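A worked example makes the gap computation concrete. The sketch below isolates the last few lines of the function. extension_bytes is a hypothetical helper, and page_start/page_end are reimplemented here with an explicit page-size parameter, whereas the linker's versions use the system page size.

```cpp
#include <cstdint>

// Rounds down to a page boundary; page_size must be a power of two.
inline std::uint64_t page_start(std::uint64_t addr, std::uint64_t page_size) {
  return addr & ~(page_size - 1);
}

// Rounds up to a page boundary.
inline std::uint64_t page_end(std::uint64_t addr, std::uint64_t page_size) {
  return page_start(addr + page_size - 1, page_size);
}

// Number of bytes the quoted code would add to p_memsz and p_filesz,
// or 0 when the two segments already touch.
inline std::uint64_t extension_bytes(std::uint64_t cur_vaddr, std::uint64_t cur_memsz,
                                     std::uint64_t next_vaddr, std::uint64_t page_size) {
  const std::uint64_t next_start = page_start(next_vaddr, page_size);
  const std::uint64_t curr_end = page_end(cur_vaddr + cur_memsz, page_size);
  return curr_end >= next_start ? 0 : next_start - curr_end;
}
```

Take a library linked with 16KiB alignment running on a 4KiB-page system: a segment at vaddr 0 with size 0x1234 ends on page boundary 0x2000, the next segment starts at 0x4000, so the extension is 0x2000 bytes. If the first segment were 0x3f00 bytes long, its page-aligned end would already reach 0x4000 and no extension would be applied.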
7.4.18 The find_libraries Algorithm¶
The find_libraries function is the workhorse of dependency resolution. It
implements a multi-phase algorithm that handles circular dependencies,
cross-namespace loading, and load shuffling for ASLR. For each library it
resolves, it calls find_library_internal, shown here.
From bionic/linker/linker.cpp (lines 1459-1528):
static bool find_library_internal(android_namespace_t* ns,
LoadTask* task,
ZipArchiveCache* zip_archive_cache,
LoadTaskList* load_tasks,
int rtld_flags) {
soinfo* candidate;
// Phase 1: Check if already loaded (by soname)
if (find_loaded_library_by_soname(ns, task->get_name(),
true /* search_linked_namespaces */, &candidate)) {
task->set_soinfo(candidate);
return true;
}
// Phase 2: Try to load from this namespace
if (load_library(ns, task, zip_archive_cache, load_tasks,
rtld_flags, true)) {
return true;
}
// Phase 3: Exempt list fallback for legacy apps
if (ns->is_exempt_list_enabled() &&
is_exempt_lib(ns, task->get_name(), task->get_needed_by())) {
ns = &g_default_namespace;
if (load_library(ns, task, zip_archive_cache, load_tasks,
rtld_flags, true)) {
return true;
}
}
// Phase 4: Search linked namespaces
for (auto& linked_namespace : ns->linked_namespaces()) {
if (find_library_in_linked_namespace(linked_namespace, task)) {
if (task->get_soinfo() != nullptr) {
return true; // Already loaded
}
// Ok to load in linked namespace
if (load_library(linked_namespace.linked_namespace(), task,
zip_archive_cache, load_tasks, rtld_flags,
false)) {
return true;
}
}
}
return false;
}
The four phases represent a carefully ordered fallback chain:
graph TD
A["find_library_internal()"] --> B{"Already loaded<br/>by soname?"}
B -->|Yes| C["Return existing soinfo"]
B -->|No| D{"Can load from<br/>this namespace?"}
D -->|Yes| E["Load and return"]
D -->|No| F{"Exempt list<br/>enabled?"}
F -->|Yes| G{In exempt list?}
G -->|Yes| H["Switch to default namespace<br/>and retry"]
G -->|No| I["Try linked namespaces"]
F -->|No| I
H -->|Found| E
H -->|Not found| I
I --> J{"Found in linked ns<br/>and accessible?"}
J -->|Yes, already loaded| C
J -->|Yes, needs loading| K["Load in linked namespace"]
J -->|No more links| L["Return false<br/>(library not found)"]
style C fill:#c8e6c9
style E fill:#c8e6c9
style K fill:#c8e6c9
style L fill:#ffcdd2
Load shuffling for ASLR:
After all LoadTasks have been created but before they are loaded, the linker shuffles the load order:
From bionic/linker/linker.cpp (lines 1532-1543):
static void shuffle(std::vector<LoadTask*>* v) {
if (is_first_stage_init()) {
// arc4random* is not available in first stage init
return;
}
for (size_t i = 0, size = v->size(); i < size; ++i) {
size_t n = size - i;
size_t r = arc4random_uniform(n);
std::swap((*v)[n-1], (*v)[r]);
}
}
This randomizes the order in which libraries are mapped into memory,
complementing the per-library ASLR from ReserveWithAlignmentPadding. Even if
an attacker knows which libraries a process loads, the order is unpredictable.
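The loop is a Fisher-Yates shuffle that fixes one element at the tail of the remaining range on each iteration. The sketch below keeps the same loop shape but substitutes std::mt19937 for arc4random_uniform() so that it can run off-device. shuffle_load_order is a hypothetical name, and a seeded Mersenne Twister is not a cryptographic source, so this illustrates the algorithm only.

```cpp
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Same loop shape as the linker's shuffle(); the random source is a stand-in.
template <typename T>
void shuffle_load_order(std::vector<T>* v, std::mt19937* rng) {
  for (std::size_t i = 0, size = v->size(); i < size; ++i) {
    const std::size_t n = size - i;  // elements not yet fixed
    std::uniform_int_distribution<std::size_t> pick(0, n - 1);
    const std::size_t r = pick(*rng);  // uniform in [0, n)
    std::swap((*v)[n - 1], (*v)[r]);   // move the chosen element to the tail
  }
}
```

Because every step only swaps elements, the result is always a permutation of the input; the randomness affects the order alone, never the set of libraries that get loaded.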
7.4.19 Duplicate Detection and the Soname Contract¶
The linker uses two strategies to detect if a library is already loaded:
By inode (strongest):
From bionic/linker/linker.cpp (lines 1106-1137):
static bool find_loaded_library_by_inode(android_namespace_t* ns,
const struct stat& file_stat,
off64_t file_offset,
bool search_linked_namespaces,
soinfo** candidate) {
auto predicate = [&](soinfo* si) {
return si->get_st_ino() == file_stat.st_ino &&
si->get_st_dev() == file_stat.st_dev &&
si->get_file_offset() == file_offset;
};
*candidate = ns->soinfo_list().find_if(predicate);
if (*candidate == nullptr && search_linked_namespaces) {
for (auto& link : ns->linked_namespaces()) {
android_namespace_t* linked_ns = link.linked_namespace();
soinfo* si = linked_ns->soinfo_list().find_if(predicate);
if (si != nullptr && link.is_accessible(si->get_soname())) {
*candidate = si;
return true;
}
}
}
return *candidate != nullptr;
}
By realpath (fallback):
static bool find_loaded_library_by_realpath(android_namespace_t* ns,
const char* realpath,
bool search_linked_namespaces,
soinfo** candidate) {
auto predicate = [&](soinfo* si) {
return strcmp(realpath, si->get_realpath()) == 0;
};
// ...
}
The inode-based check handles symlinks and hard links correctly: if
/system/lib64/libfoo.so and /system/lib64/libfoo_v2.so are hard links
to the same file, inode detection ensures only one copy is loaded. The
realpath check handles the case where proc is not mounted (early boot).
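The identity test used by the inode strategy boils down to comparing three numbers. The sketch below models it with plain structs. LoadedImage, same_image and find_loaded are hypothetical names, and the device and inode numbers in any usage are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of what identifies a loaded library on disk.
struct LoadedImage {
  std::uint64_t st_dev;      // device the file lives on
  std::uint64_t st_ino;      // inode number on that device
  std::int64_t file_offset;  // non-zero when the library sits inside an archive such as an APK
};

// Two images are the same library only if all three fields match.
inline bool same_image(const LoadedImage& a, const LoadedImage& b) {
  return a.st_dev == b.st_dev && a.st_ino == b.st_ino && a.file_offset == b.file_offset;
}

// Returns the index of an already-loaded match, or -1 if there is none.
inline int find_loaded(const std::vector<LoadedImage>& loaded, const LoadedImage& candidate) {
  for (std::size_t i = 0; i < loaded.size(); ++i) {
    if (same_image(loaded[i], candidate)) return static_cast<int>(i);
  }
  return -1;
}
```

Including the file offset matters for libraries stored uncompressed inside one archive: they share a device and inode, and only the offset tells them apart.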
7.4.20 DT_NEEDED Processing and DT_RUNPATH¶
When a library is first loaded, the linker scans its .dynamic section for
DT_NEEDED entries (libraries it depends on) and DT_RUNPATH (additional search
paths):
From bionic/linker/linker.cpp (lines 1276-1310):
const ElfReader& elf_reader = task->get_elf_reader();
for (const ElfW(Dyn)* d = elf_reader.dynamic();
d->d_tag != DT_NULL; ++d) {
if (d->d_tag == DT_RUNPATH) {
si->set_dt_runpath(elf_reader.get_string(d->d_un.d_val));
}
if (d->d_tag == DT_SONAME) {
si->set_soname(elf_reader.get_string(d->d_un.d_val));
}
if (d->d_tag == DT_FLAGS_1) {
si->set_dt_flags_1(d->d_un.d_val);
}
}
for (const ElfW(Dyn)* d = elf_reader.dynamic();
d->d_tag != DT_NULL; ++d) {
if (d->d_tag == DT_NEEDED) {
const char* name = fix_dt_needed(
elf_reader.get_string(d->d_un.d_val), elf_reader.name());
load_tasks->push_back(
LoadTask::create(name, si, ns, task->get_readers_map()));
}
}
DT_FLAGS_1 is checked early because the DF_1_GLOBAL flag determines
whether the library should be visible in the global scope. This must be known
before the library's dependencies are loaded so that namespace linking is
correct.
The fix_dt_needed function handles a backward compatibility issue: some
older 32-bit libraries had DT_NEEDED entries with absolute paths instead of
bare sonames. For apps targeting API level 22 or lower, the function strips
the directory component.
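The workaround can be approximated as follows. This is a hedged sketch, not the linker's implementation: fix_dt_needed_sketch is a hypothetical function, and the 32-bit restriction and SDK threshold are passed in as parameters based on the description above.

```cpp
#include <string>

// Simplified model of the DT_NEEDED workaround: strip the directory part
// only for 32-bit processes whose app targets API level 22 or lower.
inline std::string fix_dt_needed_sketch(const std::string& dt_needed,
                                        int target_sdk_version, bool is_32_bit) {
  if (!is_32_bit || target_sdk_version >= 23) return dt_needed;
  const std::string::size_type slash = dt_needed.rfind('/');
  return slash == std::string::npos ? dt_needed : dt_needed.substr(slash + 1);
}
```

Once the entry has been reduced to a bare soname, the normal namespace search applies, so the library is resolved through the search paths and not through the hard-coded absolute path.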
7.4.21 GDB Integration¶
The linker maintains a debug data structure that GDB uses to discover loaded
libraries. This is the link_map structure, part of the standard ELF debugging
interface.
From bionic/linker/linker_main.cpp (lines 207-215):
static void init_link_map_head(soinfo& info) {
auto& map = info.link_map_head;
map.l_addr = info.load_bias;
map.l_name = const_cast<char*>(info.get_realpath());
phdr_table_get_dynamic_section(info.phdr, info.phnum,
info.load_bias, &map.l_ld, nullptr);
}
Every soinfo contains a link_map_head that forms part of a doubly-linked
list. GDB reads this list through the r_debug structure (exposed as
_r_debug in the linker's symbol table) to enumerate loaded libraries, set
breakpoints in newly-loaded code, and resolve symbol addresses.
When a library is loaded or unloaded, the linker calls notify_gdb_of_load
or notify_gdb_of_unload, which update the r_debug state and trigger a
breakpoint that GDB can catch:
From bionic/linker/linker.cpp (lines 274-295):
static void notify_gdb_of_load(soinfo* info) {
if (info->is_linker() || info->is_main_executable()) {
return;
}
link_map* map = &(info->link_map_head);
map->l_addr = info->load_bias;
map->l_name = const_cast<char*>(info->get_realpath());
map->l_ld = info->dynamic;
CHECK(map->l_name != nullptr);
CHECK(map->l_name[0] != '\0');
notify_gdb_of_load(map);
}
7.4.22 CFI (Control Flow Integrity) Shadow¶
The linker maintains a CFI shadow -- a data structure that enables LLVM's Control Flow Integrity checks at runtime:
From bionic/linker/linker.cpp (lines 173-177):
After all libraries are loaded and linked, the linker initializes the CFI shadow:
From bionic/linker/linker_main.cpp (line 503):
The CFI shadow maps each executable page to a shadow entry that records which
indirect call targets are valid. When a CFI-instrumented library makes an
indirect call, it checks the shadow to verify the target is a valid function
entry point. Invalid targets trigger a controlled crash via
__loader_cfi_fail.
7.4.23 TLS (Thread-Local Storage) in the Linker¶
The linker manages ELF TLS (Thread-Local Storage) for all loaded libraries.
TLS variables declared with __thread or thread_local in C/C++ require
per-thread copies, and the linker allocates and initializes these.
From bionic/linker/linker_tls.h (lines 36-65):
void linker_setup_exe_static_tls(const char* progname);
void linker_finalize_static_tls();
void register_soinfo_tls(soinfo* si);
void unregister_soinfo_tls(soinfo* si);
const TlsModule& get_tls_module(size_t module_id);
struct TlsDescriptor {
#if defined(__arm__)
size_t arg;
TlsDescResolverFunc* func;
#else
TlsDescResolverFunc* func;
size_t arg;
#endif
};
struct TlsDynamicResolverArg {
size_t generation;
TlsIndex index;
};
extern "C" size_t tlsdesc_resolver_static(size_t);
extern "C" size_t tlsdesc_resolver_dynamic(size_t);
extern "C" size_t tlsdesc_resolver_unresolved_weak(size_t);
There are two TLS allocation strategies:
-
Static TLS -- For the executable and libraries loaded at startup. The total static TLS size is computed before any thread is created, and each thread's TLS block is pre-allocated as part of the thread stack.
-
Dynamic TLS -- For libraries loaded via dlopen() after threads exist. These use a Dynamic Thread Vector (DTV) that is lazily extended when a thread first accesses TLS from a dlopen'd library.
The three TLSDESC resolvers handle different cases:
- tlsdesc_resolver_static -- Fast path for static TLS (single offset add)
- tlsdesc_resolver_dynamic -- Slow path for dynamic TLS (may allocate)
- tlsdesc_resolver_unresolved_weak -- For weak TLS symbols that resolved to null (returns a dummy address)
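As a rough illustration of the two allocation strategies, the sketch below models a thread with a pre-sized static block and a lazily grown per-module vector. It is a toy model under stated assumptions, not Bionic's code: ToyModule, ToyThread, resolve_static, and resolve_dynamic are hypothetical names, and a real DTV also has to track generations and module unloading.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical description of one TLS module (one library's TLS segment).
struct ToyModule {
  size_t size;           // bytes of TLS data the module needs
  size_t static_offset;  // offset inside the static block (static modules only)
};

// Hypothetical per-thread state: one pre-sized static block plus a lazily
// grown vector of per-module blocks (a stand-in for the DTV).
struct ToyThread {
  std::vector<uint8_t> static_block;
  std::vector<std::unique_ptr<uint8_t[]>> dtv;
};

// Static path: the offset is known up front, so resolution is a single add.
inline uint8_t* resolve_static(ToyThread& t, const ToyModule& m) {
  return t.static_block.data() + m.static_offset;
}

// Dynamic path: extend the vector and allocate the module's block on the
// thread's first access to it.
inline uint8_t* resolve_dynamic(ToyThread& t, size_t module_id,
                                const ToyModule& m) {
  if (t.dtv.size() <= module_id) t.dtv.resize(module_id + 1);
  if (!t.dtv[module_id]) {
    t.dtv[module_id] = std::make_unique<uint8_t[]>(m.size);  // zero-filled
  }
  return t.dtv[module_id].get();
}
```

The static path never allocates, which is why it can serve as the fast resolver; the dynamic path pays for flexibility with a bounds check and a possible allocation.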
7.4.24 MTE Globals Support¶
On AArch64 hardware with MTE (Memory Tagging Extension), the linker can tag global variables in loaded libraries:
From bionic/linker/linker_soinfo.h (line 70):
The apply_memtag_if_mte_globals method (used during relocation) checks if a
relocation target address falls within a tagged global region and applies the
appropriate tag. This catches buffer overflows on global variables at runtime.
From bionic/linker/linker_relocate.cpp (line 169):
void* const rel_target = reinterpret_cast<void*>(
relocator.si->apply_memtag_if_mte_globals(
reloc.r_offset + relocator.si->load_bias));
7.4.25 Debugging the Linker¶
The linker provides several debugging mechanisms:
LD_DEBUG environment variable:
Setting LD_DEBUG enables verbose logging. The value is a comma-separated
list of categories:
| Value | What it logs |
|---|---|
| any | All debug output |
| lookup | Symbol lookup results |
| reloc | Relocation processing |
| timing | Total link time in microseconds |
| statistics | Relocation counts (absolute, relative, symbol, cached) |
LD_SHOW_AUXV:
Setting this environment variable dumps the auxiliary vector at startup, showing AT_PHDR, AT_ENTRY, AT_BASE, AT_HWCAP, etc.
linker logging:
From bionic/linker/linker_main.cpp (lines 508-513):
if (g_linker_debug_config.timing) {
gettimeofday(&t1, nullptr);
long long t0_us = (t0.tv_sec * 1000000LL) + t0.tv_usec;
long long t1_us = (t1.tv_sec * 1000000LL) + t1.tv_usec;
LD_DEBUG(timing, "LINKER TIME: %s: %lld microseconds",
g_argv[0], t1_us - t0_us);
}
Note that LD_DEBUG and LD_SHOW_AUXV are only honored when AT_SECURE is
not set (i.e., for non-setuid/non-setgid processes). This prevents information
leakage from privileged processes.
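The category parsing and the AT_SECURE gate can be sketched as follows. This is a minimal illustration, not the linker's parser: ToyDebugConfig and parse_debug_value are hypothetical names, the secure parameter stands in for the AT_SECURE check, and unknown categories are simply ignored here.

```cpp
#include <sstream>
#include <string>

// Hypothetical flag set mirroring the categories in the table above.
struct ToyDebugConfig {
  bool any = false;
  bool lookup = false;
  bool reloc = false;
  bool timing = false;
  bool statistics = false;
};

// Parses a comma-separated LD_DEBUG-style value. When `secure` is true
// (standing in for AT_SECURE being set), the value is ignored entirely.
inline ToyDebugConfig parse_debug_value(const std::string& value, bool secure) {
  ToyDebugConfig cfg;
  if (secure) return cfg;

  std::stringstream stream(value);
  std::string item;
  while (std::getline(stream, item, ',')) {
    if (item == "any") cfg.any = true;
    else if (item == "lookup") cfg.lookup = true;
    else if (item == "reloc") cfg.reloc = true;
    else if (item == "timing") cfg.timing = true;
    else if (item == "statistics") cfg.statistics = true;
    // Unknown categories are skipped in this sketch.
  }
  return cfg;
}
```

For example, parse_debug_value("timing,reloc", false) enables only the timing and reloc flags, while the same value with secure set to true enables nothing.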
7.4.26 The ldd Tool¶
Android's linker includes a built-in ldd equivalent. When invoked as
linker64 --list /path/to/binary, the linker sets the g_is_ldd flag:
From bionic/linker/linker_main.cpp (lines 489-492):
// Exit early for ldd. We don't want to run the code that was loaded,
// so skip the constructor calls. Skip CFI setup because it would call
// __cfi_init in libdl.so.
if (g_is_ldd) _exit(EXIT_SUCCESS);
In ldd mode, the linker loads all dependencies (printing their paths as it goes) but exits before calling constructors. This safely reveals the dependency tree without executing any library code.
7.4.27 Linker Namespace Lifecycle¶
Namespaces have a defined lifecycle during process startup and at runtime:
sequenceDiagram
participant LM as linker_main()
participant LC as linker_config
participant NS as Namespaces
participant NL as nativeloader
LM->>LC: Read /linkerconfig/ld.config.txt
LC->>NS: Create default namespace
LC->>NS: Create vndk namespace
LC->>NS: Create sphal namespace
LC->>NS: Create per-APEX namespaces
LC->>NS: Establish namespace links
Note over LM,NS: Process startup complete
NL->>NS: android_create_namespace("classloader-namespace")
NL->>NS: android_link_namespaces(classloader, system, shared_libs)
NL->>NS: android_link_namespaces(classloader, vndk, vndk_libs)
Note over NL,NS: App class loading ready
The initial namespaces are created from the linker configuration file during
init_default_namespaces(). Later, when the Java class loader loads native
libraries for an app, libnativeloader calls android_create_namespace to
create an app-specific namespace and links it to the system and VNDK namespaces
with appropriate library allowlists.
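The effect of a namespace link with an allowlist can be modeled in a few lines. The sketch below is a simplified model, not the linker's data structures: ToyNamespace, ToyLink, and find_provider are hypothetical names, and the library sets stand in for whatever the real search paths would find.

```cpp
#include <set>
#include <string>
#include <vector>

struct ToyNamespace;

// A link to another namespace, restricted to an allowlist of sonames.
struct ToyLink {
  const ToyNamespace* target;
  std::set<std::string> shared_libs;
};

struct ToyNamespace {
  std::string name;
  std::set<std::string> own_libs;  // libraries on this namespace's own paths
  std::vector<ToyLink> links;
};

// Returns the name of the namespace that would supply `soname`, or an empty
// string if the library is not reachable. The namespace's own libraries are
// checked first; a linked namespace is consulted only for allowlisted names.
inline std::string find_provider(const ToyNamespace& ns,
                                 const std::string& soname) {
  if (ns.own_libs.count(soname) != 0) return ns.name;
  for (const ToyLink& link : ns.links) {
    if (link.shared_libs.count(soname) != 0 &&
        link.target->own_libs.count(soname) != 0) {
      return link.target->name;
    }
  }
  return "";
}
```

In this model, an app namespace linked to a system namespace with an allowlist of libc.so can resolve libc.so through the link, but a system library that is missing from the allowlist stays invisible even though it exists.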
Summary¶
This chapter has traced the path from the lowest levels of Android's native
execution environment -- the system call stubs generated from SYSCALLS.TXT,
the seccomp-BPF filters that constrain which calls are permitted -- through
the C library that provides the POSIX foundation, and up to the dynamic linker
that orchestrates library loading, symbol resolution, and namespace isolation.
The key takeaways:
-
Bionic is purpose-built for Android. Its BSD license, small size, fast startup, and deep Android integration make it fundamentally different from glibc. The architecture-specific IFUNC dispatch (with paths for MOPS, Oryon, NEON, MTE) demonstrates the performance engineering invested in core operations.
-
The system call interface is generated, not hand-written. The SYSCALLS.TXT + gensyscalls.py approach provides a single source of truth for all five architectures, with architecture-specific concerns (32-bit UID calls, socketcall multiplexing, time64 variants) handled declaratively.
-
Seccomp-BPF creates a security boundary at the system call level. The allowlist/blocklist composition (with priority optimization for futex and ioctl) restricts the kernel attack surface for app processes, while the architecture-aware BPF programs handle dual-ABI systems.
-
The dynamic linker is the gatekeeper for all native code. Its ElfReader validates and loads ELF files with ASLR enhancement, 16KiB page compatibility, and BTI support. The relocation engine uses template-based fast paths and symbol caching for performance.
-
Linker namespaces enforce the Treble architecture boundary. The android_namespace_t structure, configured by linkerconfig, creates isolated worlds for platform, vendor, and product code. LL-NDK and VNDK libraries provide controlled interfaces between these worlds, while the exempt list maintains backward compatibility for legacy apps.
Together, these components form the native runtime foundation upon which every Android process executes. Understanding them is essential for anyone working on system-level Android development, debugging library loading issues, or implementing platform security features.
Architecture-Specific System Call Conventions¶
To aid readers working on specific architectures, here is a reference table of system call conventions across all five architectures supported by Bionic:
| Architecture | Syscall Number | Arg 1 | Arg 2 | Arg 3 | Arg 4 | Arg 5 | Arg 6 | Instruction | Return |
|---|---|---|---|---|---|---|---|---|---|
| arm | r7 | r0 | r1 | r2 | r3 | r4 | r5 | swi #0 | r0 |
| arm64 | x8 | x0 | x1 | x2 | x3 | x4 | x5 | svc #0 | x0 |
| x86 | eax | ebx | ecx | edx | esi | edi | ebp | int $0x80 | eax |
| x86_64 | rax | rdi | rsi | rdx | r10 | r8 | r9 | syscall | rax |
| riscv64 | a7 | a0 | a1 | a2 | a3 | a4 | a5 | ecall | a0 |
On error, the return value is in the range [-4095, -1] (or [-MAX_ERRNO, -1]
in Bionic terms). Bionic stubs negate this value and store it in errno via
__set_errno_internal.
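The return-value convention can be written out in a few lines. This is a sketch of the convention only, not the generated assembly stubs: toy_syscall_result and kToyMaxErrno are hypothetical names, and the sketch sets errno directly rather than going through __set_errno_internal.

```cpp
#include <cerrno>

// Assumed to match the 4095 bound quoted above.
constexpr long kToyMaxErrno = 4095;

// Converts a raw kernel return value to the libc convention: values in
// [-kToyMaxErrno, -1] become -1 with errno set to the negated value, and
// everything else is passed through unchanged.
inline long toy_syscall_result(long raw) {
  if (raw < 0 && raw >= -kToyMaxErrno) {
    errno = static_cast<int>(-raw);
    return -1;
  }
  return raw;
}
```

The range check matters because some system calls legitimately return large values that look negative when interpreted as signed (mmap addresses, for instance); only the top 4095 values are treated as errors.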
Note the x86 peculiarity: 32-bit x86 is so short of general-purpose registers
that the sixth argument travels in ebp, and socket operations are multiplexed
through the socketcall system call with a sub-command number. Among the five
architectures listed, only x86 uses this multiplexing.
Linker Configuration File Format¶
For completeness, here is the grammar of the ld.config.txt file format
that the linker parses at startup:
config := section*
section := "[" name "]" newline property*
property := name "=" value newline
| name "+=" value newline
# Namespace properties
namespace.<ns>.search.paths = <colon-separated-paths>
namespace.<ns>.permitted.paths = <colon-separated-paths>
namespace.<ns>.asan.search.paths = <colon-separated-paths>
namespace.<ns>.asan.permitted.paths = <colon-separated-paths>
namespace.<ns>.hwasan.search.paths = <colon-separated-paths>
namespace.<ns>.hwasan.permitted.paths = <colon-separated-paths>
namespace.<ns>.isolated = true|false
namespace.<ns>.visible = true|false
namespace.<ns>.links = <comma-separated-ns-names>
namespace.<ns>.link.<target>.shared_libs = <colon-separated-libs>
namespace.<ns>.link.<target>.allow_all_shared_libs = true|false
namespace.<ns>.allowed_libs = <colon-separated-libs>
# Section selectors
dir.<section> = <path-prefix>
additional.namespaces = <comma-separated-ns-names>
The ${LIB} placeholder in paths is expanded to lib on 32-bit systems and
lib64 on 64-bit systems. The $ORIGIN placeholder is expanded to the
directory containing the requesting library.
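The ${LIB} substitution amounts to a plain string replacement. The sketch below shows one way to express it; expand_lib_placeholder is a hypothetical helper, not the linker's own function, and it handles only ${LIB}, not $ORIGIN.

```cpp
#include <string>

// Replaces every occurrence of "${LIB}" with "lib64" or "lib", depending on
// whether the (assumed) target is 64-bit.
inline std::string expand_lib_placeholder(std::string path, bool is_64bit) {
  const std::string token = "${LIB}";
  const std::string replacement = is_64bit ? "lib64" : "lib";
  for (std::string::size_type pos = path.find(token);
       pos != std::string::npos;
       pos = path.find(token, pos + replacement.size())) {
    path.replace(pos, token.size(), replacement);
  }
  return path;
}
```

Given the search path /system/${LIB}:/vendor/${LIB}, the helper yields /system/lib64:/vendor/lib64 for a 64-bit target and /system/lib:/vendor/lib for a 32-bit one.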
Glossary of Key Terms¶
| Term | Definition |
|---|---|
| ASLR | Address Space Layout Randomization; randomizes memory layout |
| BTI | Branch Target Identification; ARM security feature |
| CFI | Control Flow Integrity; prevents indirect call hijacking |
| DT_NEEDED | Dynamic table entry listing a required dependency |
| DT_RUNPATH | Dynamic table entry with additional library search paths |
| ELF | Executable and Linkable Format; binary format for executables |
| GOT | Global Offset Table; stores resolved symbol addresses |
| IFUNC | Indirect Function; runtime-resolved function selection |
| LL-NDK | Low-Level NDK; always-available libraries for vendor |
| Load Bias | Offset between ELF virtual address and actual memory address |
| MTE | Memory Tagging Extension; ARM memory safety feature |
| PLT | Procedure Linkage Table; enables lazy symbol resolution |
| PMD | Page Middle Directory; 2MB page table entry |
| RELRO | Relocation Read-Only; security hardening for GOT |
| Seccomp-BPF | Secure Computing with Berkeley Packet Filter |
| soinfo | Shared Object Info; linker metadata for loaded libraries |
| soname | Shared Object Name; canonical library identifier |
| TLS | Thread-Local Storage; per-thread variables |
| VNDK | Vendor NDK; versioned library interface for Treble |
| VNDK-SP | VNDK Same-Process; libraries loaded in framework processes |
| VDSO | Virtual Dynamic Shared Object; kernel-mapped user-space syscalls |
| W^X | Write XOR Execute; security policy preventing W+E pages |
Further Reading and Cross-References¶
The topics covered in this chapter connect to several other chapters in this book:
-
Chapter 3 (Boot and Init): The init process is the first user-space process and one of the first consumers of Bionic and the dynamic linker. Understanding the linker's first-stage init special cases (no arc4random, no /proc) requires understanding the boot sequence.
-
Chapter 4 (Kernel): The system call interface described in Section 6.2 is the boundary between user space and kernel space. The seccomp-BPF filters are enforced by the kernel's seccomp infrastructure.
-
Chapter 8 (Binder IPC): Binder is the most frequent user of the ioctl system call, which is why ioctl is in the seccomp priority list. The Binder driver's file descriptor is one of the first things any Android process opens after the linker hands off control.
-
Chapter 9 (ART and Dalvik): The ART runtime uses dlopen() extensively to load JNI libraries, and libnativeloader creates per-app linker namespaces. ART's OAT files are loaded through the same ELF loading pipeline described in Section 6.3.
-
Chapter 11 (HAL and HIDL): The Same-Process HAL (SP-HAL) mechanism relies on the sphal linker namespace to load vendor HAL implementations directly into framework processes while maintaining namespace isolation.
-
Chapter 14 (Security): The memory safety features described in this chapter (MTE, CFI, FORTIFY_SOURCE, seccomp-BPF, W^X, RELRO) form the foundation of Android's native code security model. The linker's namespace isolation is also a key component of the Treble security boundary.
Understanding Bionic and the dynamic linker is foundational to understanding Android at the system level. Every native component -- from the init daemon to the most complex graphics pipeline -- passes through the code paths documented here.
7.5 Musl: The Host-Side Alternative to Bionic¶
While Bionic is Android's C library for device targets, AOSP also integrates musl libc as an alternative C library for host tool compilation. This section explains why musl exists in AOSP, how it's integrated, and when it's used instead of glibc.
7.5.1 Why Musl in AOSP?¶
Android's build system runs on Linux host machines. By default, host tools
(such as aapt2, dex2oat, or zipalign) are compiled against glibc,
the standard C library on most Linux distributions. However, glibc has
drawbacks for build tool distribution:
- Dynamic linking dependencies -- glibc binaries depend on the host's glibc version, causing "GLIBC_2.XX not found" errors on older systems
- Large shared library footprint -- glibc pulls in many shared objects
- Complex static linking -- glibc discourages static linking and has known issues when linked statically (NSS, locale, dlopen)
Musl solves these problems:
- Clean static linking -- musl is designed for static linking from the start
- Minimal dependencies -- produces self-contained binaries
- Portable output -- statically-linked musl binaries run across Linux distributions regardless of the installed glibc version
7.5.2 Musl Source and Version¶
Musl lives at external/musl/ in the AOSP tree:
external/musl/
├── Android.bp # Build rules (622 lines)
├── sources.bp # Generated source file lists
├── README # Upstream v1.2.5
├── METADATA # Version and license info
├── android/ # Android-specific adaptations
│ ├── generate_bp.py # Generates sources.bp from upstream
│ ├── relinterp.c # Dynamic interpreter relocation
│ ├── ldso_trampoline.cpp # Loader trampoline
│ └── include/ # Android-specific header overrides
│ ├── features.h
│ ├── math.h
│ ├── resolv.h
│ └── string.h
├── include/ # musl public headers
├── src/ # musl source (upstream)
│ ├── string/ # String operations
│ ├── malloc/ # Memory allocation
│ ├── thread/ # Threading primitives
│ ├── stdio/ # Standard I/O
│ └── ...
└── ldso/ # Dynamic linker (musl's ld.so)
The android/ directory contains Android-specific adaptations that bridge
differences between musl's upstream behavior and AOSP's requirements.
7.5.3 Enabling Musl for Host Builds¶
Musl is activated through the USE_HOST_MUSL environment variable:
# Enable musl for host tool compilation
export USE_HOST_MUSL=true
m aapt2 # Now compiled against musl instead of glibc
The build system plumbing flows through several layers:
flowchart LR
ENV["USE_HOST_MUSL=true"] --> MK["soong_config.mk<br/>line 92"]
MK --> SOONG["Soong HostMusl<br/>variable.go:263"]
SOONG --> TC["Toolchain selection<br/>linuxMuslX8664"]
TC --> FLAGS["Compiler flags<br/>-DANDROID_HOST_MUSL<br/>-nostdlibinc"]
TC --> LINK["Linker flags<br/>-nostdlib<br/>--sysroot /dev/null"]
TC --> CRT["CRT objects<br/>libc_musl_crtbegin_*"]
// Source: build/soong/android/config.go:2402
func (c *config) UseHostMusl() bool {
return Bool(c.productVariables.HostMusl)
}
7.5.4 Build System Integration¶
When musl is enabled, Soong selects dedicated toolchain factories that override the default glibc-based host compilation:
// Source: build/soong/cc/config/x86_linux_host.go:43
var linuxMuslCflags = []string{
"-DANDROID_HOST_MUSL",
"-nostdlibinc",
"--sysroot /dev/null",
}
// Source: build/soong/cc/config/x86_linux_host.go:66
var linuxMuslLdflags = []string{
"-nostdlib",
"--sysroot /dev/null",
}
The --sysroot /dev/null flag is critical: it prevents the compiler from
finding any system headers or libraries, ensuring complete isolation from the
host's glibc. All headers come from musl's own include/ directory.
Architecture Support¶
Musl supports four host architectures, each with a dedicated LLVM triple:
| Architecture | LLVM Triple | Toolchain Factory |
|---|---|---|
| x86 | i686-linux-musl | linuxMuslX86ToolchainFactory |
| x86_64 | x86_64-linux-musl | linuxMuslX8664ToolchainFactory |
| ARM | arm-linux-musleabihf | linuxMuslArmToolchainFactory |
| ARM64 | aarch64-linux-musl | linuxMuslArm64ToolchainFactory |
CRT Objects¶
Musl provides its own C runtime startup objects, defined in Android.bp:
// Source: external/musl/Android.bp:460-505
libc_musl_crtbegin_dynamic → Dynamic executable startup
libc_musl_crtbegin_static → Static executable startup
libc_musl_crtbegin_so → Shared library startup
libc_musl_crtend → Executable cleanup
libc_musl_crtend_so → Shared library cleanup
Default Shared Libraries¶
// Source: build/soong/cc/config/x86_linux_host.go:115
var MuslDefaultSharedLibraries = []string{"libc_musl"}
When musl is active, libc_musl replaces glibc as the default system shared
library. All host tools link against it instead.
7.5.5 Prebuilt Musl Toolchain¶
The prebuilt Clang toolchain includes musl runtime libraries for all supported architectures:
prebuilts/clang/host/linux-x86/clang-r563880c/musl/
├── lib/
│ ├── x86_64-unknown-linux-musl/ # x86_64 runtime
│ ├── aarch64-unknown-linux-musl/ # ARM64 runtime
│ ├── arm-unknown-linux-musleabihf/ # ARM runtime
│ ├── i686-unknown-linux-musl/ # x86 runtime
│ └── libc_musl.so # Dynamic musl library
7.5.6 Bionic-Musl Header Sharing¶
Interestingly, musl reuses some headers from Bionic's kernel UAPI layer. The build system generates a musl sysroot that includes Bionic's kernel headers:
// Source: bionic/libc/Android.bp:2703
cc_genrule {
name: "libc_musl_sysroot_bionic_headers",
// Copies bionic's kernel UAPI headers for musl's use
}
This ensures musl and bionic agree on kernel structure definitions (ioctl
numbers, socket options, etc.) since both ultimately target the same Linux
kernel.
7.5.7 Sanitizer Limitations with Musl¶
Not all sanitizers work with musl. The build system disables several:
// Source: build/soong/cc/sanitize.go:677-686
// CFI is disabled for musl
if ctx.toolchain().Musl() {
s.Cfi = nil
}
// ARM64 address and HW address sanitizers are also disabled
Sanitizer runtimes are statically linked with musl (unlike glibc where they
can be dynamically loaded), because musl's dynamic linker has different
semantics for LD_PRELOAD and dlopen.
7.5.8 Bionic vs. Musl vs. Glibc¶
Comparison of AOSP's Three C Libraries¶
graph TB
subgraph Device["Device Target"]
BIONIC["Bionic<br/>Android's custom libc"]
end
subgraph Host["Host Build Machine"]
GLIBC["glibc<br/>Default host libc"]
MUSL["musl<br/>Alternative host libc<br/>USE_HOST_MUSL=true"]
end
APP["Android App<br/>NDK code"] --> BIONIC
SYS["System Services<br/>Native daemons"] --> BIONIC
TOOL1["Host tools<br/>aapt2, dex2oat"] --> GLIBC
TOOL2["Host tools<br/>portable builds"] --> MUSL
style BIONIC fill:#e8f5e9,stroke:#2e7d32
style GLIBC fill:#e3f2fd,stroke:#1565c0
style MUSL fill:#fff3e0,stroke:#e65100
| Aspect | Bionic | glibc | musl |
|---|---|---|---|
| Target | Android device | Linux host (default) | Linux host (opt-in) |
| Static linking | Supported | Problematic (NSS/locale) | Clean, recommended |
| Binary portability | N/A (device only) | Tied to host glibc version | Runs on any Linux |
| Size | Minimal | Large | Minimal |
| POSIX compliance | Partial (intentional) | Full | Nearly full |
| Thread model | pthread (custom) | NPTL | Custom lightweight |
| Activation | Default for device | Default for host | USE_HOST_MUSL=true |
7.5.9 When to Use Musl¶
Musl is primarily useful for:
- CI/CD environments — build servers with varying glibc versions
- Hermetic builds — reproducible builds independent of host system libraries
- Distribution — shipping prebuilt host tools that work across Linux distros
- Cross-compilation — building host tools for ARM build servers (ARM64 musl toolchain)
The AOSP build infrastructure is progressively moving toward musl for host tools to improve build hermeticity and reduce "works on my machine" issues.
7.6 Advanced Topics¶
7.6.1 The soinfo Method Interface¶
The soinfo structure provides a rich method interface for the linker to
operate on loaded libraries. The key methods reveal the lifecycle of a loaded
library:
From bionic/linker/linker_soinfo.h (lines 250-347):
struct soinfo {
// Lifecycle
void call_constructors();
void call_destructors();
void call_pre_init_constructors();
bool prelink_image(bool deterministic_memtag_globals = false);
bool link_image(const SymbolLookupList& lookup_list,
soinfo* local_group_root,
const android_dlextinfo* extinfo,
size_t* relro_fd_offset);
bool protect_relro();
bool protect_16kib_app_compat_code();
// MTE support
void tag_globals(bool deterministic_memtag_globals);
ElfW(Addr) apply_memtag_if_mte_globals(ElfW(Addr) sym_addr) const;
// Symbol lookup
const ElfW(Sym)* find_symbol_by_name(SymbolName& symbol_name,
const version_info* vi) const;
ElfW(Sym)* find_symbol_by_address(const void* addr);
ElfW(Addr) resolve_symbol_address(const ElfW(Sym)* s) const {
if (ELF_ST_TYPE(s->st_info) == STT_GNU_IFUNC) {
return call_ifunc_resolver(s->st_value + load_bias);
}
return static_cast<ElfW(Addr)>(s->st_value + load_bias);
}
// Reference counting
size_t increment_ref_count();
size_t decrement_ref_count();
size_t get_ref_count() const;
// Navigation
soinfo* get_local_group_root() const;
soinfo_list_t& get_children();
soinfo_list_t& get_parents();
android_namespace_t* get_primary_namespace();
android_namespace_list_t& get_secondary_namespaces();
// Version support
const ElfW(Versym)* get_versym(size_t n) const;
ElfW(Addr) get_verneed_ptr() const;
size_t get_verneed_cnt() const;
ElfW(Addr) get_verdef_ptr() const;
size_t get_verdef_cnt() const;
};
The resolve_symbol_address method is particularly noteworthy: for standard
symbols, it simply adds the load bias to the symbol value. But for GNU IFUNC
symbols (STT_GNU_IFUNC), it calls the IFUNC resolver function to determine
the actual implementation address at runtime. This is how architecture-specific
optimizations (like the memcpy variants in Section 6.1.6) are dispatched.
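The IFUNC branch can be demonstrated outside the linker with ordinary function pointers. The following is a toy model, assuming a made-up feature flag: ToySym, toy_copy_resolver, g_toy_has_simd, and the other Toy names are hypothetical and only mimic the shape of resolve_symbol_address.

```cpp
#include <cstdint>

using ToyAddr = uintptr_t;
using ToyResolver = ToyAddr (*)();

enum class ToySymType { kFunc, kIfunc };

struct ToySym {
  ToySymType type;
  ToyAddr value;  // offset from the load bias; for kIfunc, the resolver's
};

// Two stand-in implementations and a made-up CPU feature flag.
static bool g_toy_has_simd = false;
static int toy_copy_generic() { return 0; }
static int toy_copy_simd() { return 1; }

// The resolver runs once at link time and returns the chosen implementation.
static ToyAddr toy_copy_resolver() {
  return reinterpret_cast<ToyAddr>(g_toy_has_simd ? &toy_copy_simd
                                                  : &toy_copy_generic);
}

// Mirrors the shape of resolve_symbol_address: ordinary symbols get the load
// bias added; IFUNC symbols call the resolver found at that address.
inline ToyAddr toy_resolve_symbol_address(const ToySym& s, ToyAddr load_bias) {
  const ToyAddr addr = s.value + load_bias;
  if (s.type == ToySymType::kIfunc) {
    return reinterpret_cast<ToyResolver>(addr)();
  }
  return addr;
}
```

Because the choice is made when the symbol is resolved, callers bind directly to the selected implementation and pay no per-call dispatch cost.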
The lifecycle methods are called in a strict order:
graph TD
A["soinfo_alloc()"] --> B["ElfReader::Read()"]
B --> C["ElfReader::Load()"]
C --> D["prelink_image()"]
D --> E["link_image()"]
E --> F["protect_relro()"]
F --> G["call_pre_init_constructors()"]
G --> H["call_constructors()"]
H --> I["Library in use"]
I --> J["call_destructors()"]
J --> K["soinfo_free()"]
style A fill:#e1f5fe
style D fill:#fff3e0
style E fill:#f3e5f5
style H fill:#e8f5e9
style I fill:#c8e6c9
style K fill:#ffcdd2
prelink_image() parses the .dynamic section to fill in the soinfo
fields: symbol table, string table, hash tables, relocation tables, and
init/fini arrays. It does not resolve any symbols.
link_image() processes all relocations, resolving symbol references and patching code and data. After this step, all function pointers and global variable references point to the correct addresses.
protect_relro() marks RELRO (Relocation Read-Only) pages as read-only. RELRO is a security feature: after relocations are applied to the GOT (Global Offset Table), those pages are remapped as read-only to prevent GOT overwrite attacks.
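The write-then-seal pattern behind RELRO can be reproduced with mmap and mprotect. This sketch is an analogy under stated assumptions rather than the linker's code: toy_relocate_then_protect is a hypothetical function, an anonymous page stands in for the GOT, and the "relocations" are just load_bias plus a fixed offset.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cstddef>
#include <cstdint>

// Maps one anonymous page as a stand-in for a GOT, "relocates" it by writing
// a few entries, then makes it read-only. Returns true if every step worked.
inline bool toy_relocate_then_protect(uintptr_t load_bias) {
  const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  void* mem = mmap(nullptr, page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (mem == MAP_FAILED) return false;

  // Pretend relocation: fill in resolved addresses while the page is writable.
  uintptr_t* got = static_cast<uintptr_t*>(mem);
  for (size_t i = 0; i < 4; ++i) got[i] = load_bias + i * 0x10;

  // Seal the page, analogous to what protect_relro() does for RELRO pages.
  bool ok = mprotect(mem, page, PROT_READ) == 0;

  // Reads still succeed; a write at this point would raise SIGSEGV.
  ok = ok && got[1] == load_bias + 0x10;

  munmap(mem, page);
  return ok;
}
```

The ordering is the point: all writes happen before the mprotect call, so nothing that runs afterwards can redirect the stored addresses without first changing the page protection.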
7.6.2 GNU Hash: NEON-Accelerated Symbol Lookup¶
The linker includes a NEON-accelerated GNU hash implementation for ARM architectures:
From bionic/linker/linker_gnu_hash.h (lines 35-54):
#if defined(__arm__) || defined(__aarch64__)
#define USE_GNU_HASH_NEON 1
#else
#define USE_GNU_HASH_NEON 0
#endif
#if USE_GNU_HASH_NEON
#include "arch/arm_neon/linker_gnu_hash_neon.h"
#endif
static std::pair<uint32_t, uint32_t>
calculate_gnu_hash_simple(const char* name) {
uint32_t h = 5381;
const uint8_t* name_bytes =
reinterpret_cast<const uint8_t*>(name);
#pragma unroll 8
while (*name_bytes != 0) {
h += (h << 5) + *name_bytes++; // h*33 + c
}
return { h, reinterpret_cast<const char*>(name_bytes) - name };
}
static inline std::pair<uint32_t, uint32_t>
calculate_gnu_hash(const char* name) {
#if USE_GNU_HASH_NEON
return calculate_gnu_hash_neon(name);
#else
return calculate_gnu_hash_simple(name);
#endif
}
The GNU hash function (h = h * 33 + c, starting from 5381) is the well-known
DJB hash. The simple implementation uses #pragma unroll 8 to hint the
compiler to unroll the loop. On ARM, the NEON implementation processes multiple
bytes in parallel using SIMD instructions, which is measurably faster for long
symbol names.
The function returns both the hash value and the symbol name length. The length
is a byproduct of the hash computation (we scan to the null terminator) and
avoids a redundant strlen() call later in the lookup.
7.6.3 CFI Shadow Architecture¶
The CFI (Control Flow Integrity) shadow is a critical security feature managed by the linker. It provides a lookup table that maps code addresses to CFI validation information.
From bionic/linker/linker_cfi.h (lines 38-49):
// This class keeps the contents of CFI shadow up-to-date with the
// current set of loaded libraries.
// Shadow is mapped and initialized lazily as soon as the first
// CFI-enabled DSO is loaded. It is updated after any library is
// loaded (but before any constructors are ran), and before any
// library is unloaded.
class CFIShadowWriter : private CFIShadow {
uint16_t* MemToShadow(uintptr_t x) {
return reinterpret_cast<uint16_t*>(
*shadow_start + MemToShadowOffset(x));
}
The shadow has the following characteristics:
- Lazy initialization -- Not created until the first CFI-enabled library is loaded, avoiding overhead for processes that do not use CFI.
- 16-bit granularity -- Each shadow entry is a 16-bit value that encodes the validation information for a range of code addresses.
- Update timing -- Updated after library load (before constructors) and before library unload. This ensures that CFI checks during constructors operate on a consistent shadow.
- Integration -- The __loader_cfi_fail function in dlfcn.cpp is called when a CFI check fails, providing a centralized crash handler with diagnostic information.
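The address-to-entry mapping can be illustrated with a small table of 16-bit values. The sketch below is a toy model, not the real shadow layout: ToyCfiShadow is a hypothetical class, and the 4 KiB granularity and the meanings assigned to the values 0 and 1 are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative constants; assumptions for this sketch only.
constexpr unsigned kToyGranularityShift = 12;  // one entry per 4 KiB of code
constexpr uint16_t kToyInvalid = 0;    // no library covers this address
constexpr uint16_t kToyUnchecked = 1;  // covered, but not CFI-instrumented

class ToyCfiShadow {
 public:
  ToyCfiShadow(uintptr_t base, size_t bytes)
      : base_(base),
        entries_((bytes >> kToyGranularityShift) + 1, kToyInvalid) {}

  // Records `value` for every granule overlapping [start, end).
  void mark(uintptr_t start, uintptr_t end, uint16_t value) {
    const uintptr_t step = uintptr_t{1} << kToyGranularityShift;
    for (uintptr_t a = start; a < end; a += step) {
      const size_t i = index(a);
      if (a >= base_ && i < entries_.size()) entries_[i] = value;
    }
  }

  // Returns the shadow entry for `addr`, or kToyInvalid if out of range.
  uint16_t lookup(uintptr_t addr) const {
    if (addr < base_) return kToyInvalid;
    const size_t i = index(addr);
    return i < entries_.size() ? entries_[i] : kToyInvalid;
  }

 private:
  size_t index(uintptr_t addr) const {
    return (addr - base_) >> kToyGranularityShift;
  }

  uintptr_t base_;
  std::vector<uint16_t> entries_;
};
```

A lookup is a subtraction, a shift, and one load, which is what makes a shadow table attractive for a check that runs on every indirect call.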
7.6.4 The Block Allocator¶
The linker uses a custom block allocator for soinfo and related structures instead of malloc. This provides two benefits:
- Deterministic layout -- All soinfo structures are in known pages, making write-protection possible via ProtectedDataGuard.
- No malloc dependency -- The linker cannot use malloc (which lives in libc.so) during early initialization before libc is loaded.
From bionic/linker/linker.cpp (lines 89-91):
static LinkerTypeAllocator<soinfo> g_soinfo_allocator;
static LinkerTypeAllocator<LinkedListEntry<soinfo>> g_soinfo_links_allocator;
static LinkerTypeAllocator<android_namespace_t> g_namespace_allocator;
static LinkerTypeAllocator<LinkedListEntry<android_namespace_t>>
g_namespace_list_allocator;
The LinkerTypeAllocator allocates objects in page-sized blocks. When a new
object is needed and the current block is full, a new page is mmap'd. The
allocator tracks all pages, enabling protect_all() to iterate over them and
change their protection with mprotect().
From bionic/linker/linker.cpp (lines 484-491):
void ProtectedDataGuard::protect_data(int protection) {
g_soinfo_allocator.protect_all(protection);
g_soinfo_links_allocator.protect_all(protection);
g_namespace_allocator.protect_all(protection);
g_namespace_list_allocator.protect_all(protection);
}
This means that between dlopen/dlclose calls, all linker metadata is
read-only. An attacker who tries to overwrite a soinfo structure (e.g., to
redirect function pointers) triggers a page fault instead of corrupting the
structure.
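A page-granular allocator with a protect_all operation can be sketched in user space as follows. This is a simplified model, not LinkerTypeAllocator: ToyPageAllocator is a hypothetical class, it never frees individual objects, and it tracks its pages in a std::vector, which relies on the heap and is therefore something the real linker could not do.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cstddef>
#include <new>
#include <type_traits>
#include <vector>

template <typename T>
class ToyPageAllocator {
  static_assert(std::is_trivially_destructible<T>::value,
                "sketch never runs destructors");
  static_assert(sizeof(T) <= 4096, "T must fit in one page");

 public:
  ToyPageAllocator() : page_size_(static_cast<size_t>(sysconf(_SC_PAGESIZE))) {}
  ~ToyPageAllocator() {
    for (void* p : pages_) munmap(p, page_size_);
  }
  ToyPageAllocator(const ToyPageAllocator&) = delete;
  ToyPageAllocator& operator=(const ToyPageAllocator&) = delete;

  // Returns storage for one T, mapping a fresh page when the current is full.
  T* alloc() {
    if (pages_.empty() || used_ + sizeof(T) > page_size_) {
      void* p = mmap(nullptr, page_size_, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      if (p == MAP_FAILED) return nullptr;
      pages_.push_back(p);  // simplification: the vector itself uses the heap
      used_ = 0;
    }
    T* obj = new (static_cast<char*>(pages_.back()) + used_) T();
    used_ += sizeof(T);
    return obj;
  }

  // Applies one protection to every page the allocator owns.
  bool protect_all(int prot) {
    for (void* p : pages_) {
      if (mprotect(p, page_size_, prot) != 0) return false;
    }
    return true;
  }

  size_t page_count() const { return pages_.size(); }

 private:
  size_t page_size_;
  size_t used_ = 0;
  std::vector<void*> pages_;
};
```

A guard in the style of ProtectedDataGuard would call protect_all(PROT_READ | PROT_WRITE) on entry to a mutating operation and protect_all(PROT_READ) on exit, so the objects are writable only while the linker is deliberately changing them.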
7.6.5 Sanitizer Support in the Linker¶
The linker has deep integration with several sanitizers:
ASan (AddressSanitizer):
ASan-instrumented libraries are installed in /data/asan/system/lib64/ (and
similar paths for vendor/odm). The linker prepends these paths when ASan mode
is detected, ensuring that instrumented versions of libraries take priority
over production versions.
HWASan (Hardware AddressSanitizer):
HWASan-instrumented libraries live in hwasan/ subdirectories. The linker
notifies HWASan of library load/unload events via weak callbacks:
From bionic/libc/bionic/libc_init_dynamic.cpp (lines 75-80):
extern "C" __attribute__((weak)) void __hwasan_library_loaded(
ElfW(Addr) base,
const ElfW(Phdr)* phdr,
ElfW(Half) phnum);
extern "C" __attribute__((weak)) void __hwasan_library_unloaded(
ElfW(Addr) base,
const ElfW(Phdr)* phdr,
ElfW(Half) phnum);
These weak symbols are resolved only when HWASan runtime is present, allowing the same linker binary to work with or without HWASan.
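The weak-callback pattern itself is easy to reproduce with a GCC- or Clang-compatible toolchain. In the sketch below, toy_runtime_library_loaded is a hypothetical hook that nothing defines, and toy_notify_loaded is a hypothetical wrapper; neither name comes from Bionic or HWASan.

```cpp
// Hypothetical hook; nothing in this sketch defines it, so the weak
// reference resolves to a null address at link time.
extern "C" __attribute__((weak)) void toy_runtime_library_loaded(
    const char* name);

// Calls the hook only if some runtime linked into the process defines it.
// Returns true when the hook was present and invoked.
inline bool toy_notify_loaded(const char* name) {
  if (toy_runtime_library_loaded != nullptr) {
    toy_runtime_library_loaded(name);
    return true;
  }
  return false;
}
```

If a runtime that defines toy_runtime_library_loaded were linked in, the same binary would start invoking the hook with no source change, which is the property the weak declarations above rely on.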
MTE (Memory Tagging Extension):
MTE support is integrated at multiple levels:
- Stack tagging -- The linker calls __libc_init_mte_stack() after loading all libraries that request stack tagging via their .dynamic section.
- Heap tagging -- Enabled via ELF notes (note_memtag_heap_async.S / note_memtag_heap_sync.S).
- Global tagging -- The linker's tag_globals() method applies MTE tags to global variables in libraries that opt in.
7.6.6 The Complete Process Startup Sequence¶
Combining all the components from this chapter, here is the complete sequence
from exec() to main() for a dynamically-linked Android application:
sequenceDiagram
participant K as Kernel
participant L as Linker
participant LC as libc.so
participant A as Application
Note over K: exec() system call
K->>K: Parse ELF headers
K->>K: Map PT_LOAD segments
K->>K: Read PT_INTERP -> /system/bin/linker64
K->>K: Map linker into process
K->>K: Set up auxiliary vector
K->>K: Set up process stack
K->>L: Transfer to linker _start
Note over L: Phase 1: Self-bootstrap
L->>L: __linker_init()
L->>L: Self-relocate (no external deps)
L->>L: Set up linker soinfo
Note over L: Phase 2: Environment
L->>L: Sanitize AT_SECURE env vars
L->>L: __system_properties_init()
L->>L: platform_properties_init() [BTI check]
L->>L: linker_debuggerd_init()
L->>L: Parse LD_DEBUG, LD_LIBRARY_PATH, LD_PRELOAD
Note over L: Phase 3: Executable setup
L->>L: get_executable_info() or load_executable()
L->>L: Create somain soinfo
L->>L: PIE validation (ET_DYN required)
L->>L: init_default_namespaces()
Note over L: Phase 4: Dependency resolution
L->>L: somain->prelink_image()
L->>L: Collect DT_NEEDED + LD_PRELOAD names
L->>L: find_libraries() [BFS dependency walk]
loop For each dependency
L->>L: find_library_internal()
L->>L: Search namespace paths
L->>L: ElfReader::Read() + Load()
L->>L: Create soinfo
L->>L: Collect transitive DT_NEEDED
end
Note over L: Phase 5: Linking
loop For each loaded library
L->>L: prelink_image()
L->>L: link_image() [relocations]
L->>L: protect_relro()
end
Note over L: Phase 6: VDSO
L->>L: add_vdso()
L->>L: Link VDSO as [vdso] soinfo
Note over L: Phase 7: MTE & TLS
L->>L: __libc_init_mte() [AArch64]
L->>L: __libc_init_mte_stack() [AArch64]
L->>L: linker_finalize_static_tls()
L->>L: __libc_init_main_thread_final()
Note over L: Phase 8: CFI
L->>L: CFIShadow::InitialLinkDone()
Note over L: Phase 9: Initialization
L->>LC: Call __libc_preinit() [.preinit_array]
LC->>LC: Init TLS, globals, properties
LC->>LC: Init Scudo allocator
LC->>LC: Init netd client
L->>L: somain->call_pre_init_constructors()
loop For each library (dependency order)
L->>L: si->call_constructors()
end
Note over L: Phase 10: Handoff
L->>L: purge_unused_memory()
L->>A: Jump to AT_ENTRY (executable entry point)
A->>LC: __libc_init() -> main(argc, argv, envp)
This sequence illustrates why the linker is one of the most performance-sensitive components in Android. Every microsecond spent in the linker is multiplied by every process start. The linker's optimizations -- symbol caching, template-specialized relocation loops, NEON-accelerated hashing -- all serve to minimize this startup overhead, while the protected-data guards keep its metadata safe along the way.
7.6.7 Error Messages and Diagnostics¶
The linker provides detailed error messages when linking fails. Understanding these messages is essential for debugging native library issues:
| Error Message | Cause | Solution |
|---|---|---|
| "libfoo.so" not found | Library not on any search path | Check namespace paths, APK lib directory |
| cannot locate symbol "bar" referenced by "libfoo.so" | Unresolved strong symbol | Check library dependencies, symbol visibility |
| "libfoo.so" is not accessible for the namespace "default" | Namespace isolation | Check linkerconfig, uses-native-library manifest |
| "libfoo.so" is 32-bit instead of 64-bit | ABI mismatch | Build library for correct architecture |
| "libfoo.so" has bad ELF magic | Corrupted or non-ELF file | Verify file integrity |
| Android only supports position-independent executables | Non-PIE executable | Rebuild with -fPIE -pie |
| has load segments that are both writable and executable | W+E segment (API >= 26) | Fix linker script, use separate segments |
| program alignment cannot be smaller than system page size | 4KiB library on 16KiB system | Rebuild with 16KiB alignment or enable compat |
Each error message is carefully crafted to include the library name, the namespace context, and (where applicable) a reference to the Android bug tracker entry that motivated the error or exception.
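As a rough illustration of how the table above can be used during triage, the following sketch matches a linker error string (for example, the text returned by `dlerror()` after a failed `dlopen()`) against the message fragments listed in the table. The helper name `classify_linker_error`, the `Rule` struct, and the returned labels are hypothetical and not part of Bionic; only the matched substrings come from the table.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical triage helper, not a Bionic API. The needles are the message
// fragments from the table above; the cause labels are illustrative.
struct Rule {
    std::string_view needle;  // fragment expected somewhere in the message
    std::string_view cause;   // short label for the likely cause
};

// More specific fragments come first; the generic "not found" is checked last.
constexpr Rule kRules[] = {
    {"is not accessible for the namespace", "namespace isolation"},
    {"cannot locate symbol", "unresolved strong symbol"},
    {"-bit instead of ", "ABI mismatch"},
    {"has bad ELF magic", "corrupted or non-ELF file"},
    {"only supports position-independent executables", "non-PIE executable"},
    {"both writable and executable", "W+E load segment"},
    {"program alignment cannot be smaller than system page size",
     "page size mismatch"},
    {"not found", "library not on any search path"},
};

// Returns the label of the first rule whose fragment occurs in msg,
// or "unknown" when no rule matches.
std::string_view classify_linker_error(std::string_view msg) {
    for (const Rule& rule : kRules) {
        if (msg.find(rule.needle) != std::string_view::npos) {
            return rule.cause;
        }
    }
    return "unknown";
}
```

A caller would typically pass the string obtained from `dlerror()` and log the returned label next to the full message; the full message should always be kept, since it carries the library name and namespace context.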
7.6.8 Performance Considerations¶
The linker's performance directly affects app startup time and system boot time. Key performance characteristics:
Relocation processing:
- The template-specialized `process_relocation_impl<Mode>` generates three separate code paths, eliminating branch overhead for the common cases.
- The symbol cache reduces redundant hash table lookups by 80%+ in typical workloads.
- The `__predict_false` and `__predict_true` hints guide the compiler's branch prediction optimizations.
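The three ideas in this list can be sketched together in a few lines. This is a simplified model under stated assumptions, not the code in `linker_relocate.cpp`: the `Mode` values, the `Reloc` and `Relocator` types, the one-entry cache, and the `PREDICT_FALSE` macro (a stand-in for `__predict_false`) are all hypothetical. It shows how a template parameter lets the compiler emit a separate loop body per mode, and how a small cache avoids repeating a lookup for consecutive relocations against the same symbol.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a branch hint such as __predict_false (GCC/Clang builtin).
#define PREDICT_FALSE(x) __builtin_expect(!!(x), 0)

// Illustrative modes; the real linker's mode names and meanings may differ.
enum class Mode { kRelativeOnly, kSymbolOnly, kGeneral };
enum class Kind { kRelative, kSymbol };

struct Reloc {
    Kind kind;
    size_t slot;      // index of the word to patch
    size_t sym;       // symbol index (ignored for relative relocations)
    uint64_t addend;
};

struct Relocator {
    uint64_t load_bias = 0;
    std::vector<uint64_t> symbol_values;  // toy symbol table; 0 means undefined
    std::vector<uint64_t> slots;          // memory being relocated
    size_t lookups = 0;                   // counts uncached lookups
    bool cache_valid = false;
    size_t cached_sym = 0;
    uint64_t cached_value = 0;

    // One-entry symbol cache in front of the (toy) symbol lookup.
    bool lookup(size_t sym, uint64_t* out) {
        if (cache_valid && cached_sym == sym) {
            *out = cached_value;
            return true;
        }
        ++lookups;
        if (PREDICT_FALSE(sym >= symbol_values.size() || symbol_values[sym] == 0)) {
            return false;  // unresolved symbol
        }
        cached_sym = sym;
        cached_value = symbol_values[sym];
        cache_valid = true;
        *out = cached_value;
        return true;
    }
};

// `if constexpr` discards the branches that do not apply to mode M, so each
// instantiation contains only the code its mode needs.
template <Mode M>
bool process_relocation(Relocator& r, const Reloc& rel) {
    if (PREDICT_FALSE(rel.slot >= r.slots.size())) return false;
    if constexpr (M == Mode::kRelativeOnly) {
        r.slots[rel.slot] = r.load_bias + rel.addend;
        return true;
    } else if constexpr (M == Mode::kSymbolOnly) {
        uint64_t value = 0;
        if (!r.lookup(rel.sym, &value)) return false;
        r.slots[rel.slot] = value + rel.addend;
        return true;
    } else {
        uint64_t base = r.load_bias;  // used as-is for relative relocations
        if (rel.kind == Kind::kSymbol && !r.lookup(rel.sym, &base)) return false;
        r.slots[rel.slot] = base + rel.addend;
        return true;
    }
}

template <Mode M>
bool relocate_all(Relocator& r, const std::vector<Reloc>& relocs) {
    for (const Reloc& rel : relocs) {
        if (!process_relocation<M>(r, rel)) return false;
    }
    return true;
}
```

In this model the caller picks the mode once per relocation table (for instance `relocate_all<Mode::kGeneral>(r, relocs)`), so the per-relocation loop carries no mode checks at run time.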
ELF loading:
- The `ElfReader` uses `MappedFileFragment` for zero-copy reading of headers (mmap instead of read).
- Segment mapping uses `MAP_FIXED | MAP_PRIVATE`, which tells the kernel to replace the existing PROT_NONE mapping without creating a new VMA.
- Transparent huge pages (`MADV_HUGEPAGE`) reduce TLB pressure for large executable segments.
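The reserve-then-overlay pattern behind the segment mapping bullet can be sketched with plain POSIX calls. This is a minimal illustration, not the code in `linker_phdr.cpp`: the `Reservation` struct and both function names are hypothetical, and anonymous memory filled with `memcpy` stands in for the file-backed mapping a real loader would create from the ELF file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical description of a reserved address range.
struct Reservation {
    void* base;
    size_t size;
};

// Reserve contiguous address space with no access rights (PROT_NONE).
Reservation reserve_address_space(size_t size) {
    void* p = mmap(nullptr, size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return {p == MAP_FAILED ? nullptr : p, size};
}

// Place one "segment" at a page-aligned offset inside the reservation.
// MAP_FIXED replaces the PROT_NONE pages at that exact address.
// Assumption: anonymous memory plus memcpy replaces the file-backed mapping.
void* map_segment(const Reservation& r, size_t offset, const void* data,
                  size_t len, int final_prot) {
    const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    const size_t map_len = (len + page - 1) / page * page;
    if (r.base == nullptr || map_len == 0) return nullptr;
    if (offset % page != 0 || offset + map_len > r.size) return nullptr;

    void* want = static_cast<char*>(r.base) + offset;
    void* got = mmap(want, map_len, PROT_READ | PROT_WRITE,
                     MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (got == MAP_FAILED) return nullptr;
    std::memcpy(got, data, len);

#ifdef MADV_HUGEPAGE
    // Advisory only: the kernel may ignore or reject the hint.
    madvise(got, map_len, MADV_HUGEPAGE);
#endif

    // Drop to the segment's final protection once its contents are in place.
    if (mprotect(got, map_len, final_prot) != 0) return nullptr;
    return got;
}
```

Reserving the whole range first guarantees that all segments of one library land at their expected relative offsets, and unmapping the reservation with a single `munmap(r.base, r.size)` releases every segment placed inside it.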
Memory management:
- The block allocator avoids the overhead of malloc/free for linker-internal structures.
- `purge_unused_memory()` is called before handing control to the application, returning any internal buffers that are no longer needed.
- RELRO protection prevents accidental writes to resolved GOT entries, improving cache behavior (read-only pages can be shared between processes).
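To make the first two bullets concrete, here is a minimal sketch of a fixed-size block allocator with a purge step. It is an assumption-laden model rather than the linker's own allocator: the class name `BlockAllocator` and its methods are hypothetical, and chunks come from `std::malloc` for brevity, whereas a linker-internal allocator would more plausibly obtain pages directly from the kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical fixed-size block allocator: carves chunks into equal blocks
// and recycles them through an intrusive free list.
class BlockAllocator {
    struct FreeBlock {
        FreeBlock* next;
    };

  public:
    explicit BlockAllocator(size_t block_size, size_t blocks_per_chunk = 64)
        : block_size_(round_up(block_size)),
          blocks_per_chunk_(blocks_per_chunk == 0 ? 1 : blocks_per_chunk) {}
    ~BlockAllocator() { release_chunks(); }
    BlockAllocator(const BlockAllocator&) = delete;
    BlockAllocator& operator=(const BlockAllocator&) = delete;

    // O(1): pop a block from the free list, adding a chunk if it is empty.
    void* alloc() {
        if (free_list_ == nullptr && !add_chunk()) return nullptr;
        FreeBlock* block = free_list_;
        free_list_ = block->next;
        ++in_use_;
        return block;
    }

    // O(1): push the block back onto the free list.
    void release(void* p) {
        if (p == nullptr) return;
        assert(in_use_ > 0);
        free_list_ = new (p) FreeBlock{free_list_};
        --in_use_;
    }

    // Return all chunks to the system, but only when no block is in use.
    bool purge() {
        if (in_use_ != 0) return false;
        release_chunks();
        return true;
    }

    size_t chunk_count() const { return chunks_.size(); }

  private:
    static size_t round_up(size_t n) {
        const size_t a = alignof(std::max_align_t);
        if (n < sizeof(FreeBlock)) n = sizeof(FreeBlock);
        return (n + a - 1) / a * a;
    }

    bool add_chunk() {
        char* chunk = static_cast<char*>(std::malloc(block_size_ * blocks_per_chunk_));
        if (chunk == nullptr) return false;
        chunks_.push_back(chunk);
        for (size_t i = 0; i < blocks_per_chunk_; ++i) {
            free_list_ = new (chunk + i * block_size_) FreeBlock{free_list_};
        }
        return true;
    }

    void release_chunks() {
        for (void* chunk : chunks_) std::free(chunk);
        chunks_.clear();
        free_list_ = nullptr;
    }

    size_t block_size_;
    size_t blocks_per_chunk_;
    FreeBlock* free_list_ = nullptr;
    size_t in_use_ = 0;
    std::vector<void*> chunks_;
};
```

Because every block has the same size, allocation and release are pointer swaps with no per-object header, which is the property that makes this style attractive for many small, same-typed internal records.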
Startup timing:
The linker records and reports its total execution time when `LD_DEBUG=timing` is set.
Typical values range from 5 ms for simple executables to 50 ms or more for applications with many native dependencies. The Android team continuously optimizes this path, as it directly affects the user-perceived app launch latency.
Key Source Files Reference¶
| File | Path | Purpose |
|---|---|---|
| SYSCALLS.TXT | `bionic/libc/SYSCALLS.TXT` | System call definitions |
| gensyscalls.py | `bionic/libc/tools/gensyscalls.py` | Stub generator |
| SECCOMP_BLOCKLIST_APP.TXT | `bionic/libc/SECCOMP_BLOCKLIST_APP.TXT` | Blocked syscalls for apps |
| SECCOMP_ALLOWLIST_APP.TXT | `bionic/libc/SECCOMP_ALLOWLIST_APP.TXT` | Extra allowed syscalls for apps |
| SECCOMP_ALLOWLIST_COMMON.TXT | `bionic/libc/SECCOMP_ALLOWLIST_COMMON.TXT` | Extra allowed syscalls for all |
| SECCOMP_BLOCKLIST_COMMON.TXT | `bionic/libc/SECCOMP_BLOCKLIST_COMMON.TXT` | Common blocked syscalls |
| SECCOMP_PRIORITY.TXT | `bionic/libc/SECCOMP_PRIORITY.TXT` | Hot-path syscalls |
| seccomp_policy.cpp | `bionic/libc/seccomp/seccomp_policy.cpp` | BPF filter generation |
| syscall.S (arm64) | `bionic/libc/arch-arm64/bionic/syscall.S` | AArch64 syscall entry |
| ifuncs.cpp (arm64) | `bionic/libc/arch-arm64/ifuncs.cpp` | IFUNC resolvers |
| libc_init_dynamic.cpp | `bionic/libc/bionic/libc_init_dynamic.cpp` | Dynamic init |
| libc_init_common.cpp | `bionic/libc/bionic/libc_init_common.cpp` | Common init |
| malloc_common.cpp | `bionic/libc/bionic/malloc_common.cpp` | Allocator dispatch |
| pthread_create.cpp | `bionic/libc/bionic/pthread_create.cpp` | Thread creation |
| linker.cpp | `bionic/linker/linker.cpp` | Core linker logic |
| linker_main.cpp | `bionic/linker/linker_main.cpp` | Linker entry and main sequence |
| linker_phdr.cpp | `bionic/linker/linker_phdr.cpp` | ELF loading |
| linker_relocate.cpp | `bionic/linker/linker_relocate.cpp` | Relocation processing |
| linker_namespaces.h | `bionic/linker/linker_namespaces.h` | Namespace structures |
| linker_soinfo.h | `bionic/linker/linker_soinfo.h` | soinfo definition |
| linker_config.cpp | `bionic/linker/linker_config.cpp` | Config file parser |
| dlfcn.cpp | `bionic/linker/dlfcn.cpp` | dlopen/dlsym API |
| vndk.go | `build/soong/cc/vndk.go` | VNDK build definitions |
| main.cc | `system/linkerconfig/main.cc` | Linkerconfig entry point |
| systemdefault.cc | `system/linkerconfig/contents/namespace/systemdefault.cc` | System namespace |
| vendordefault.cc | `system/linkerconfig/contents/namespace/vendordefault.cc` | Vendor namespace |
| vndk.cc | `system/linkerconfig/contents/namespace/vndk.cc` | VNDK namespace |
| system_links.cc | `system/linkerconfig/contents/common/system_links.cc` | Bionic lib links |