Chapter 13: Graphics and Render Pipeline

Android's graphics stack is one of the most intricate subsystems in AOSP. It spans from the Java View.draw() call in an application's UI thread all the way down through native C++ rendering libraries, GPU shader compilation, hardware-accelerated composition, and finally to photons leaving the physical display panel. This chapter traces that entire journey through the actual AOSP source code, revealing the architecture, data structures, synchronization mechanisms, and design decisions that make 60+ FPS rendering possible on billions of devices.

13.1 Graphics Stack Overview

13.1.1 The Full Pipeline at a Glance

Every frame that appears on an Android screen follows a deterministic path through multiple subsystems. Understanding this path is essential for performance analysis, driver debugging, and framework development.

graph TD
    A["Application<br/>View.draw()"] --> B["HWUI<br/>RecordingCanvas"]
    B --> C["DisplayList<br/>(SkiaDisplayList)"]
    C --> D["RenderThread<br/>DrawFrameTask"]
    D --> E["SkiaPipeline<br/>(GL or Vulkan)"]
    E --> F["Skia<br/>(Ganesh GPU Backend)"]
    F --> G{"GPU API"}
    G -->|OpenGL ES| H["EGL / GLES<br/>Driver"]
    G -->|Vulkan| I["Vulkan<br/>Driver"]
    H --> J["GPU Hardware"]
    I --> J
    J --> K["BufferQueue"]
    K --> L["SurfaceFlinger"]
    L --> M["RenderEngine<br/>(Skia-based)"]
    M --> N["Hardware Composer<br/>(HWC)"]
    N --> O["Display Panel"]

    style A fill:#4CAF50,color:#fff
    style D fill:#2196F3,color:#fff
    style F fill:#FF9800,color:#fff
    style L fill:#9C27B0,color:#fff
    style N fill:#F44336,color:#fff

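13.1.2 Thread Architecture

Android's rendering architecture is fundamentally multi-threaded. Each application process has at least two threads involved in rendering, the UI thread and the RenderThread, and they hand finished frames on to SurfaceFlinger and the HWC HAL: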

sequenceDiagram
    participant UI as UI Thread
    participant RT as RenderThread
    participant SF as SurfaceFlinger
    participant HWC as HWC HAL

    UI->>UI: View.invalidate()
    UI->>UI: Choreographer VSYNC
    UI->>UI: ViewRootImpl.performTraversals()
    UI->>UI: View.draw() → RecordingCanvas
    UI->>RT: DrawFrameTask.drawFrame()
    Note over UI,RT: UI thread blocks on sync

    RT->>RT: syncFrameState()
    RT-->>UI: Unblock UI thread
    RT->>RT: CanvasContext.draw()
    RT->>RT: SkiaPipeline.renderFrame()
    RT->>RT: Skia → GPU commands
    RT->>SF: eglSwapBuffers / vkQueuePresentKHR

    SF->>SF: Acquire buffer
    SF->>SF: RenderEngine composition
    SF->>HWC: setLayerBuffer()
    HWC->>HWC: Hardware compose
    HWC-->>SF: presentDisplay()

13.1.3 Key Source Directories

The graphics stack spans multiple top-level directories in AOSP:

| Directory | Purpose | Key Files |
| --- | --- | --- |
| `frameworks/native/opengl/` | EGL/GLES loader and wrappers | `libs/EGL/eglApi.cpp`, `libs/EGL/egl.cpp` |
| `frameworks/native/vulkan/` | Vulkan loader | `libvulkan/driver.cpp`, `libvulkan/api.cpp` |
| `frameworks/base/libs/hwui/` | Hardware UI renderer | `RenderNode.h`, `renderthread/` |
| `external/skia/` | 2D rendering engine | `src/gpu/ganesh/`, `include/core/` |
| `frameworks/native/services/surfaceflinger/` | System compositor | `SurfaceFlinger.cpp` |
| `hardware/interfaces/graphics/` | HAL interfaces | `composer/`, `allocator/` |
| `external/angle/` | GL-on-Vulkan translation | `src/libGLESv2/`, `src/libEGL/` |

13.1.4 Pipeline Selection

HWUI supports two rendering backends. The backend is chosen through system properties, which each process reads when it initializes HWUI:

# Source: frameworks/base/libs/hwui/Properties.h
# Property: debug.hwui.renderer
#   "skiavk" → SkiaVulkan pipeline
#   "skiagl" → SkiaGL pipeline

As seen in RenderThread.cpp (line 286):

// frameworks/base/libs/hwui/renderthread/RenderThread.cpp, line 286
static const char* pipelineToString() {
    switch (auto renderType = Properties::getRenderPipelineType()) {
        case RenderPipelineType::SkiaGL:
            return "Skia (OpenGL)";
        case RenderPipelineType::SkiaVulkan:
            return "Skia (Vulkan)";
        default:
            LOG_ALWAYS_FATAL("canvas context type %d not supported",
                             (int32_t)renderType);
    }
}

The CanvasContext::create() factory in CanvasContext.cpp (line 82) instantiates the correct pipeline:

// frameworks/base/libs/hwui/renderthread/CanvasContext.cpp, line 82
CanvasContext* CanvasContext::create(RenderThread& thread, bool translucent,
                                     RenderNode* rootRenderNode,
                                     IContextFactory* contextFactory,
                                     pid_t uiThreadId, pid_t renderThreadId) {
    auto renderType = Properties::getRenderPipelineType();
    switch (renderType) {
        case RenderPipelineType::SkiaGL:
            return new CanvasContext(thread, translucent, rootRenderNode,
                contextFactory,
                std::make_unique<skiapipeline::SkiaOpenGLPipeline>(thread),
                uiThreadId, renderThreadId);
        case RenderPipelineType::SkiaVulkan:
            return new CanvasContext(thread, translucent, rootRenderNode,
                contextFactory,
                std::make_unique<skiapipeline::SkiaVulkanPipeline>(thread),
                uiThreadId, renderThreadId);
    }
}
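The mapping from the property value to a pipeline type can be sketched as follows. This is a minimal illustration rather than HWUI's code: `parseRendererProperty` is a hypothetical helper, and the enum is a stand-in for HWUI's `RenderPipelineType` that carries only the two values named above.

```cpp
#include <optional>
#include <string_view>

// Stand-in for HWUI's RenderPipelineType (only the two values named above).
enum class RenderPipelineType { SkiaGL, SkiaVulkan };

// Hypothetical helper: translate a debug.hwui.renderer value into a pipeline type.
// Returns std::nullopt for unrecognized values so the caller can pick a default.
std::optional<RenderPipelineType> parseRendererProperty(std::string_view value) {
    if (value == "skiagl") return RenderPipelineType::SkiaGL;
    if (value == "skiavk") return RenderPipelineType::SkiaVulkan;
    return std::nullopt;
}
```

Returning an empty optional instead of aborting leaves the fallback policy to the caller, which is a design choice of this sketch and not something the source confirms.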

13.2 OpenGL ES

13.2.1 Architecture of the EGL/GLES Loader

Android's OpenGL ES implementation is a loader-layer architecture. Applications never link directly against GPU vendor drivers. Instead, they link against libEGL.so and libGLESv2.so, which are thin dispatch libraries maintained in frameworks/native/opengl/.

graph LR
    A["Application"] --> B["libEGL.so<br/>(EGL Wrapper)"]
    A --> C["libGLESv2.so<br/>(GLES Wrapper)"]
    B --> D["EGL Layers<br/>(Optional)"]
    D --> E["Vendor EGL<br/>Driver"]
    C --> F["GL Hooks<br/>(TLS dispatch)"]
    F --> G["Vendor GLES<br/>Driver"]
    B -.->|ANGLE| H["libEGL_angle.so"]
    C -.->|ANGLE| I["libGLESv2_angle.so"]

    style B fill:#2196F3,color:#fff
    style C fill:#2196F3,color:#fff
    style E fill:#FF9800,color:#fff
    style G fill:#FF9800,color:#fff

13.2.2 The EGL Connection: `egl_connection_t`

The central data structure is egl_connection_t, declared in egldefs.h. It holds function pointers for both EGL and GLES calls:

// frameworks/native/opengl/libs/EGL/egldefs.h
struct egl_connection_t {
    // function tables for EGL platform calls
    platform_impl_t platform;
    // function tables for GL calls - one per GLES version
    gl_hooks_t* hooks[2];
    // handle to the loaded driver shared object
    void* dso;
};

The global singleton gEGLImpl is declared in egl.cpp (line 33):

// frameworks/native/opengl/libs/EGL/egl.cpp, line 33
egl_connection_t gEGLImpl;
gl_hooks_t gHooks[2];
gl_hooks_t gHooksNoContext;

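13.2.3 Driver Initialization

Driver loading is triggered lazily on the first EGL call. The function egl_init_drivers() in egl.cpp (line 155) is the entry point; it takes the initialization lock and delegates to egl_init_drivers_locked() (line 125), which does the actual work: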

// frameworks/native/opengl/libs/EGL/egl.cpp, line 125
static EGLBoolean egl_init_drivers_locked() {
    // ...
    Loader& loader(Loader::getInstance());
    egl_connection_t* cnx = &gEGLImpl;
    cnx->hooks[egl_connection_t::GLESv1_INDEX] =
        &gHooks[egl_connection_t::GLESv1_INDEX];
    cnx->hooks[egl_connection_t::GLESv2_INDEX] =
        &gHooks[egl_connection_t::GLESv2_INDEX];
    cnx->dso = loader.open(cnx);

    // Check for layers after driver load
    if (cnx->dso) {
        LayerLoader& layer_loader(LayerLoader::getInstance());
        layer_loader.InitLayers(cnx);
    }
    return cnx->dso ? EGL_TRUE : EGL_FALSE;
}

The Loader::open() method (in Loader.cpp) performs the actual dlopen() of the selected driver. It considers the following driver sources:

Updated driver from GraphicsEnv namespace (Game driver / updatable driver)
Built-in vendor driver: libEGL_<name>.so, libGLESv2_<name>.so
ANGLE (if selected by the system): libEGL_angle.so

13.2.4 EGL API Dispatch

Public EGL functions in eglApi.cpp follow a common pattern: clear the thread-local error, obtain the global connection, and dispatch through the platform function table. eglGetDisplay() additionally makes sure the drivers are loaded first:

// frameworks/native/opengl/libs/EGL/eglApi.cpp, line 40
EGLDisplay eglGetDisplay(EGLNativeDisplayType display) {
    ATRACE_CALL();
    if (egl_init_drivers() == EGL_FALSE) {
        return setError(EGL_BAD_PARAMETER, EGL_NO_DISPLAY);
    }
    clearError();
    egl_connection_t* const cnx = &gEGLImpl;
    return cnx->platform.eglGetDisplay(display);
}

This pattern repeats throughout the roughly 660 lines of eglApi.cpp. The platform table can point either directly at the vendor driver or at optional EGL layers (used for debugging, validation, or ANGLE interposition).
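The wrapper pattern described above can be sketched in isolation. Everything in this block is a stand-in written for illustration (the type names, `fakeDriverGetDisplay`, and the error enum are assumptions, not AOSP identifiers); it only shows the shape of "load lazily, clear the per-thread error, dispatch through a function table".

```cpp
#include <cstdint>

// Illustrative stand-ins; none of these names are the real AOSP types.
using DisplayHandle = std::intptr_t;
enum class Error { Success, BadParameter };

struct PlatformTable {
    DisplayHandle (*getDisplay)(int nativeDisplay) = nullptr;
};

struct Connection {
    PlatformTable platform;
    bool driverLoaded = false;
};

static Connection gConnection;                        // one global connection
static thread_local Error tlsError = Error::Success;  // per-thread error slot

// Fake "driver" entry point used to populate the table.
static DisplayHandle fakeDriverGetDisplay(int nativeDisplay) {
    return 100 + nativeDisplay;
}

// Lazy driver load: fills the function table on first use.
static bool initDrivers() {
    if (!gConnection.driverLoaded) {
        gConnection.platform.getDisplay = &fakeDriverGetDisplay;
        gConnection.driverLoaded = true;
    }
    return gConnection.driverLoaded;
}

// Wrapper entry point: init, clear the thread's error, dispatch via the table.
DisplayHandle wrapperGetDisplay(int nativeDisplay) {
    if (!initDrivers()) {
        tlsError = Error::BadParameter;
        return 0;
    }
    tlsError = Error::Success;
    return gConnection.platform.getDisplay(nativeDisplay);
}
```

Because callers only ever see the wrapper, the table can later be repointed at a layer or a different driver without changing application code.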

13.2.5 GLES Function Dispatch via TLS

OpenGL ES functions use a different dispatch mechanism -- Thread-Local Storage (TLS). When eglMakeCurrent() binds a context, it sets the TLS hooks to point at the correct driver:

// frameworks/native/opengl/libs/EGL/egl.cpp, line 186
void setGlThreadSpecific(gl_hooks_t const* value) {
    gl_hooks_t const* volatile* tls_hooks = get_tls_hooks();
    tls_hooks[TLS_SLOT_OPENGL_API] = value;
}

Each GLES function (e.g., glDrawArrays) is a tiny trampoline that reads the current hooks from TLS and jumps to the driver implementation. These trampolines are generated at build time from the entries.in and entries_gles1.in files.

When no context is current, the hooks point at gl_no_context() (line 42), which logs an error:

// frameworks/native/opengl/libs/EGL/egl.cpp, line 42
static int gl_no_context() {
    if (egl_tls_t::logNoContextCall()) {
        const char* const error = "call to OpenGL ES API with "
                                  "no current context (logged once per thread)";
        // ...
    }
    return 0;
}
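A simplified model of this TLS dispatch, including the no-context fallback, might look like the sketch below. It uses C++ `thread_local` instead of the platform's reserved TLS slot, and all names (`GlHooks`, `setThreadHooks`, `trampolineDrawArrays`) are hypothetical.

```cpp
#include <cstdio>

// Illustrative stand-in for gl_hooks_t with a single entry point.
struct GlHooks {
    int (*drawArrays)(int first, int count);
};

static int gDriverDrawCalls = 0;
static int driverDrawArrays(int /*first*/, int count) {
    ++gDriverDrawCalls;
    return count;
}

// Fallback used when no context is current: log once per thread, do nothing.
static int noContextDrawArrays(int, int) {
    thread_local bool logged = false;
    if (!logged) {
        std::fprintf(stderr, "GL call with no current context\n");
        logged = true;
    }
    return 0;
}

static const GlHooks kDriverHooks{&driverDrawArrays};
static const GlHooks kNoContextHooks{&noContextDrawArrays};

// Per-thread pointer to the active hook table (the "TLS slot").
static thread_local const GlHooks* tlsHooks = &kNoContextHooks;

// What a makeCurrent-style call would do: bind or unbind the driver hooks.
void setThreadHooks(const GlHooks* hooks) {
    tlsHooks = hooks ? hooks : &kNoContextHooks;
}

// Trampoline: read the hooks from TLS and forward the call.
int trampolineDrawArrays(int first, int count) {
    return tlsHooks->drawArrays(first, count);
}
```

Pointing the slot at a table of stubs, rather than leaving it null, keeps the trampoline branch-free: it never has to check whether a context is bound.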

13.2.6 EGL Layers

AOSP supports intercepting EGL/GLES calls through a layer mechanism, similar to Vulkan layers. The LayerLoader class scans for layers based on:

debug.gles.layers system property
Application metadata in GraphicsEnv
Settings from the GPU debug app

Layers are loaded as shared libraries that implement the eglGetProcAddress-based interception pattern.

13.2.7 Built-in Extensions

The EGL wrapper exposes a set of built-in extensions that are implemented in the wrapper itself, independent of the vendor driver. From egl_platform_entries.cpp (line 86):

// frameworks/native/opengl/libs/EGL/egl_platform_entries.cpp, line 86
const char* const gBuiltinExtensionString =
    "EGL_ANDROID_front_buffer_auto_refresh "
    "EGL_ANDROID_get_native_client_buffer "
    "EGL_ANDROID_presentation_time "
    "EGL_EXT_surface_CTA861_3_metadata "
    "EGL_EXT_surface_SMPTE2086_metadata "
    "EGL_KHR_get_all_proc_addresses "
    "EGL_KHR_swap_buffers_with_damage "
    ;

Android-specific extensions like EGL_ANDROID_native_fence_sync and EGL_ANDROID_presentation_time are critical for frame timing and synchronization with SurfaceFlinger.

13.2.8 The MultifileBlobCache

Shader compilation is expensive. AOSP implements a persistent shader cache via MultifileBlobCache (in frameworks/native/opengl/libs/EGL/MultifileBlobCache.cpp, 1,097 lines). This cache:

Stores compiled shader binaries on disk across app launches
Uses a multi-file layout (one file per cache entry) for robustness
Implements LRU eviction when the cache exceeds size limits
Employs a background worker thread for deferred disk writes
Validates entries using CRC checksums

The key data structures from MultifileBlobCache.h:

// frameworks/native/opengl/libs/EGL/MultifileBlobCache.h, line 44
struct MultifileHeader {
    uint32_t magic;
    uint32_t crc;
    EGLsizeiANDROID keySize;
    EGLsizeiANDROID valueSize;
};
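To illustrate how a header with a magic number and a checksum can guard a cache entry, here is a minimal sketch. It assumes a standard CRC-32 over the payload bytes; which checksum variant the real cache uses and which bytes it covers are not stated above, so `crc32`, `EntryHeader`, `kMagic`, and `isEntryValid` are all illustrative.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320). Whether the real cache
// uses this exact variant is an assumption made for illustration.
std::uint32_t crc32(const std::uint8_t* data, std::size_t size) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < size; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            const std::uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return ~crc;
}

// Hypothetical on-disk entry header; the payload bytes follow it.
struct EntryHeader {
    std::uint32_t magic;
    std::uint32_t crc;  // checksum of the payload that follows
};

constexpr std::uint32_t kMagic = 0x12345678u;  // placeholder, not the real magic

// Accept an entry only if the magic matches and the payload checksum agrees.
bool isEntryValid(const EntryHeader& header,
                  const std::vector<std::uint8_t>& payload) {
    return header.magic == kMagic &&
           header.crc == crc32(payload.data(), payload.size());
}
```

An entry that fails either check would simply be treated as a cache miss, so a truncated or corrupted file costs one shader recompile instead of a crash.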

The cache also maintains a "hot cache" -- a memory-mapped subset of recently-used entries for fast access without disk I/O:

// frameworks/native/opengl/libs/EGL/MultifileBlobCache.h, line 64
struct MultifileHotCache {
    int entryFd;
    uint8_t* entryBuffer;
    size_t entrySize;
};

13.2.9 Java Bindings

The Java-side OpenGL ES APIs (android.opengl.GLES20, GLES30, etc.) are generated by frameworks/native/opengl/tools/glgen/. This code generator reads the OpenGL ES specification XML and produces both the Java classes and JNI stub C++ files. The generated stubs call through to the native GLES functions, which in turn dispatch via the TLS hooks.

graph TD
    A["Java: GLES30.glDrawArrays()"] --> B["JNI: android_opengl_GLES30.cpp"]
    B --> C["Native: glDrawArrays()"]
    C --> D["TLS Hook Dispatch"]
    D --> E["Vendor GLES Driver"]

    style A fill:#4CAF50,color:#fff
    style E fill:#FF9800,color:#fff

13.2.10 EGL Object Lifecycle

The EGL wrapper maintains reference-counted wrappers around driver EGL objects. This prevents use-after-free bugs when applications misbehave:

graph TD
    A["App calls<br/>eglCreateContext()"] --> B["egl_context_t created<br/>(ref count = 1)"]
    B --> C["eglMakeCurrent()<br/>(ref count = 2)"]
    C --> D["App calls<br/>eglDestroyContext()"]
    D --> E["Marks for deletion<br/>(ref count = 1)"]
    E --> F["eglMakeCurrent(NONE)<br/>(ref count = 0)"]
    F --> G["Actually destroyed"]

    style B fill:#4CAF50,color:#fff
    style G fill:#F44336,color:#fff

The egl_object_t base class in egl_object.h provides this reference counting:

egl_display_t -- wraps EGLDisplay
egl_context_t -- wraps EGLContext, tracks GL extensions
egl_surface_t -- wraps EGLSurface
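The lifecycle in the diagram above can be modeled with a small reference-counted class. This is a sketch of the idea only; `RefCountedObject` and its methods are hypothetical and do not mirror the egl_object_t interface.

```cpp
#include <cassert>

// Illustrative model of a reference-counted wrapper; not the AOSP API.
class RefCountedObject {
public:
    // Creation hands the application one reference.
    RefCountedObject() : refs_(1) {}

    // Taken while the object is in use, e.g. current on a thread.
    void acquire() { ++refs_; }

    // Drops one reference; returns true when the object is now really gone.
    bool release() {
        assert(refs_ > 0);
        --refs_;
        if (refs_ == 0) alive_ = false;
        return !alive_;
    }

    // Application-level destroy: drop the creation reference exactly once.
    bool destroy() {
        if (destroyRequested_) return !alive_;
        destroyRequested_ = true;
        return release();
    }

    bool alive() const { return alive_; }
    int refs() const { return refs_; }

private:
    int refs_;
    bool destroyRequested_ = false;
    bool alive_ = true;
};
```

Calling `destroy()` while the object is still current only marks it; the underlying resource survives until the last `release()`, which is what protects against use-after-free.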

13.2.11 Thread-Local Error Handling

Each thread maintains its own EGL error state via egl_tls_t:

// frameworks/native/opengl/libs/EGL/egl_tls.cpp
// Thread-local storage for:
// - Current EGL error code
// - Current EGL context
// - "no context call" logging flag

The clearError() call at the start of each EGL function resets the per-thread error to EGL_SUCCESS, and any subsequent error overwrites it. This follows the EGL specification requirement that eglGetError() returns the most recent error.

13.2.12 EGL Initialization Sequence

The complete EGL initialization flow on Android:

sequenceDiagram
    participant App as Application
    participant EGL as libEGL.so
    participant Loader as Loader
    participant Driver as Vendor Driver

    App->>EGL: eglGetDisplay()
    EGL->>EGL: egl_init_drivers()
    EGL->>EGL: pthread_once(early_egl_init)
    Note over EGL: Fill gHooksNoContext<br/>with gl_no_context stubs
    EGL->>Loader: Loader::getInstance()
    EGL->>Loader: loader.open(cnx)
    Loader->>Loader: Determine driver path
    Loader->>Driver: dlopen("libEGL_<name>.so")
    Loader->>Driver: dlopen("libGLESv2_<name>.so")
    Loader->>Driver: Resolve all function pointers
    Loader-->>EGL: Driver loaded
    EGL->>EGL: LayerLoader.InitLayers(cnx)
    EGL-->>App: EGLDisplay handle

    App->>EGL: eglInitialize()
    EGL->>Driver: driver.eglInitialize()
    Driver-->>EGL: EGL version
    EGL-->>App: Major, Minor version

    App->>EGL: eglChooseConfig()
    EGL->>Driver: driver.eglChooseConfig()
    Driver-->>EGL: Matching configs
    EGL-->>App: Config list

    App->>EGL: eglCreateContext()
    EGL->>Driver: driver.eglCreateContext()
    Driver-->>EGL: GL context handle
    EGL->>EGL: Create egl_context_t wrapper
    EGL-->>App: EGLContext handle

    App->>EGL: eglMakeCurrent()
    EGL->>Driver: driver.eglMakeCurrent()
    EGL->>EGL: setGlThreadSpecific(driver hooks)
    Note over EGL: GL calls now dispatch<br/>to vendor driver

13.2.13 Extension String Management

The EGL wrapper manages two sets of extensions:

Built-in extensions: Implemented in the wrapper itself (always available)
Driver extensions: Passed through from the vendor driver (availability varies)

The combined extension string is returned to applications via eglQueryString(). The extensions most relevant to Android's frame pipeline are mostly Android-specific, plus one Khronos extension:

| Extension | Purpose |
| --- | --- |
| `EGL_ANDROID_native_fence_sync` | GPU↔CPU fence synchronization |
| `EGL_ANDROID_presentation_time` | Frame presentation timestamps |
| `EGL_ANDROID_front_buffer_auto_refresh` | Direct front-buffer rendering |
| `EGL_ANDROID_get_frame_timestamps` | Per-frame timing data |
| `EGL_ANDROID_get_native_client_buffer` | AHardwareBuffer↔EGLClientBuffer |
| `EGL_KHR_swap_buffers_with_damage` | Partial screen update |

13.2.14 BlobCache: The Single-File Cache

Before the MultifileBlobCache, Android used a simpler BlobCache (and FileBlobCache) implementation. These are still present in the codebase:

BlobCache.cpp -- In-memory key-value cache with LRU eviction
FileBlobCache.cpp -- Extends BlobCache with file-backed persistence
egl_cache.cpp -- Integrates the blob cache with the EGL driver's cache callbacks

The egl_cache registers callbacks with the driver via the EGL_ANDROID_blob_cache extension, allowing the driver to store and retrieve compiled shaders through the AOSP cache infrastructure.

graph TD
    A["GPU Driver"] -->|"set(key, value)"| B["egl_cache"]
    B --> C["MultifileBlobCache"]
    C --> D["Disk Storage"]

    E["GPU Driver"] -->|"get(key)"| B
    B --> C
    C -->|"cached value"| E

    style A fill:#FF9800,color:#fff
    style C fill:#2196F3,color:#fff
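The set/get flow in the diagram can be sketched with an in-memory map standing in for the cache. The function names `setBlob` and `getBlob` are hypothetical; the get contract used here (return the stored size, copy only when the caller's buffer is large enough, return 0 on a miss) follows the description in the EGL_ANDROID_blob_cache extension.

```cpp
#include <cstddef>
#include <cstring>
#include <map>
#include <string>

// In-memory stand-in for the cache behind the driver callbacks.
static std::map<std::string, std::string> gBlobs;

static std::string toBytes(const void* data, std::ptrdiff_t size) {
    return std::string(static_cast<const char*>(data),
                       static_cast<std::size_t>(size));
}

// "set" callback: store or overwrite the value for a key.
void setBlob(const void* key, std::ptrdiff_t keySize,
             const void* value, std::ptrdiff_t valueSize) {
    gBlobs[toBytes(key, keySize)] = toBytes(value, valueSize);
}

// "get" callback: returns the stored size (0 on a miss) and copies the value
// only if it fits into the caller's buffer.
std::ptrdiff_t getBlob(const void* key, std::ptrdiff_t keySize,
                       void* value, std::ptrdiff_t valueSize) {
    const auto it = gBlobs.find(toBytes(key, keySize));
    if (it == gBlobs.end()) return 0;
    const auto stored = static_cast<std::ptrdiff_t>(it->second.size());
    if (stored <= valueSize) {
        std::memcpy(value, it->second.data(), it->second.size());
    }
    return stored;
}
```

Returning the size even when the buffer is too small lets the driver probe with a zero-length buffer, allocate, and call again.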

13.3 Vulkan

13.3.1 The Vulkan Loader Architecture

Android's Vulkan loader lives in frameworks/native/vulkan/libvulkan/. Unlike EGL, Vulkan was designed from the ground up with a loader-layer-ICD architecture. The Android loader is relatively thin because Vulkan's explicit API design reduces the loader's responsibilities.

graph TD
    A["Application"] --> B["libvulkan.so<br/>(AOSP Loader)"]
    B --> C["API Layer<br/>(api.cpp)"]
    C --> D["Validation Layers<br/>(Optional)"]
    D --> E["Driver Layer<br/>(driver.cpp)"]
    E --> F["Vendor Vulkan HAL<br/>(vulkan.{name}.so)"]
    F --> G["GPU Hardware"]

    subgraph "Android Additions"
        H["Swapchain<br/>(swapchain.cpp)"]
        I["VkSurfaceKHR<br/>↔ ANativeWindow"]
    end

    C --> H
    H --> I
    I --> E

    style B fill:#2196F3,color:#fff
    style F fill:#FF9800,color:#fff

13.3.2 Driver Loading (`driver.cpp`)

The Vulkan HAL is loaded by the Hal class in driver.cpp. The loading sequence tries multiple sources in priority order:

// frameworks/native/vulkan/libvulkan/driver.cpp, line 249
bool Hal::Open() {
    ATRACE_CALL();
    const nsecs_t openTime = systemTime();

    if (hal_.ShouldUnloadBuiltinDriver()) {
        hal_.UnloadBuiltinDriver();
    }
    if (hal_.dev_) return true;

    // Use a stub device unless we successfully open a real HAL device.
    hal_.dev_ = &stubhal::kDevice;

    int result;
    const hwvulkan_module_t* module = nullptr;

    result = LoadUpdatedDriver(&module);      // 1. Game/updated driver
    if (result == -ENOENT) {
        result = LoadDriverFromApex(&module); // 2. Vulkan APEX
    }
    if (result == -ENOENT) {
        result = LoadBuiltinDriver(&module);  // 3. Built-in vendor driver
    }
    // ...
}

The LoadDriver() function (line 157) searches for the vendor HAL using system properties:

// frameworks/native/vulkan/libvulkan/driver.cpp, line 145
const std::array<const char*, 2> HAL_SUBNAME_KEY_PROPERTIES = {{
    "ro.hardware.vulkan",
    "ro.board.platform",
}};

This resolves to loading a shared library named vulkan.<property_value>.so from the vendor partition.
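The name resolution can be sketched as below. The `PropertyGetter` callback is a hypothetical stand-in for a system property read, and `vulkanHalCandidates` is an illustrative helper, not a loader function; only the two property keys and the `vulkan.<value>.so` pattern come from the text above.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical property reader: returns "" when the property is unset.
using PropertyGetter = std::function<std::string(const std::string&)>;

// Build candidate library names in the order the properties are listed above.
std::vector<std::string> vulkanHalCandidates(const PropertyGetter& getProperty) {
    static const char* const kKeys[] = {"ro.hardware.vulkan", "ro.board.platform"};
    std::vector<std::string> names;
    for (const char* key : kKeys) {
        const std::string value = getProperty(key);
        if (!value.empty()) names.push_back("vulkan." + value + ".so");
    }
    return names;
}
```

On a device where only ro.board.platform is set, the list has a single candidate derived from the platform name.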

13.3.3 Driver Loading from APEX

Android supports loading Vulkan drivers from APEX modules, enabling driver updates outside of full OTA updates:

// frameworks/native/vulkan/libvulkan/driver.cpp, line 206
int LoadDriverFromApex(const hwvulkan_module_t** module) {
    auto apex_name = android::base::GetProperty(
        RO_VULKAN_APEX_PROPERTY, "");
    if (apex_name == "") return -ENOENT;
    std::replace(apex_name.begin(), apex_name.end(), '.', '_');
    auto ns = android_get_exported_namespace(apex_name.c_str());
    if (!ns) return -ENOENT;
    // ...
    return LoadDriver(ns, apex_name.c_str(), module);
}

13.3.4 Instance and Device Creation (`api.cpp`)

The API layer in api.cpp handles instance/device creation, layer discovery, and function dispatch. The OverrideLayerNames class (line 59) manages implicit Vulkan layer injection:

// frameworks/native/vulkan/libvulkan/api.cpp, line 59
class OverrideLayerNames {
public:
    OverrideLayerNames(bool is_instance,
                       const VkAllocationCallbacks& allocator)
        : is_instance_(is_instance), allocator_(allocator),
          scope_(VK_SYSTEM_ALLOCATION_SCOPE_COMMAND),
          names_(nullptr), name_count_(0), implicit_layers_() {
        implicit_layers_.result = VK_SUCCESS;
    }
    // ...
};

Layers can be injected via:

GraphicsEnv::getDebugLayers() -- from Android Settings UI or developer options
debug.vulkan.layers system property -- colon-separated layer list
debug.vulkan.layer.<N> properties -- individual layer selection by priority
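Parsing a colon-separated layer list, as used by the debug.vulkan.layers property, could look like the following sketch. `splitLayerList` is a hypothetical helper, and skipping empty segments is an assumption of this example.

```cpp
#include <string>
#include <vector>

// Split a colon-separated layer list, skipping empty segments.
std::vector<std::string> splitLayerList(const std::string& value) {
    std::vector<std::string> layers;
    std::size_t start = 0;
    while (start <= value.size()) {
        std::size_t end = value.find(':', start);
        if (end == std::string::npos) end = value.size();
        if (end > start) layers.push_back(value.substr(start, end - start));
        start = end + 1;
    }
    return layers;
}
```

The resulting vector preserves the order in which the layers were listed in the property value.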

13.3.5 The `CreateInfoWrapper` Class

The CreateInfoWrapper in driver.cpp (line 82) is a critical piece of infrastructure that sanitizes VkInstanceCreateInfo and VkDeviceCreateInfo structures. It performs:

API version validation between the app request and the ICD capability
Extension filtering (removing extensions the ICD doesn't support)
pNext chain sanitization (removing unrecognized structures)
Layer name resolution

// frameworks/native/vulkan/libvulkan/driver.cpp, line 82
class CreateInfoWrapper {
public:
    CreateInfoWrapper(const VkInstanceCreateInfo& create_info,
                      uint32_t icd_api_version,
                      const VkAllocationCallbacks& allocator);
    CreateInfoWrapper(VkPhysicalDevice physical_dev,
                      const VkDeviceCreateInfo& create_info,
                      uint32_t icd_api_version,
                      const VkAllocationCallbacks& allocator);

    VkResult Validate();
    const std::bitset<ProcHook::EXTENSION_COUNT>&
        GetHookExtensions() const;
    const std::bitset<ProcHook::EXTENSION_COUNT>&
        GetHalExtensions() const;
    // ...
};

13.3.6 The Swapchain: Vulkan Meets Android Surfaces

swapchain.cpp is one of the most important files in the Vulkan loader. It implements VK_KHR_swapchain by bridging Vulkan's presentation model with Android's ANativeWindow / BufferQueue system.

Key operations:

Surface transform translation -- Android's native window transforms and Vulkan's surface transforms are isomorphic but encoded differently:

// frameworks/native/vulkan/libvulkan/swapchain.cpp, line 82
VkSurfaceTransformFlagBitsKHR TranslateNativeToVulkanTransform(
    int native) {
    switch (native) {
        case 0:
            return VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR;
        case NATIVE_WINDOW_TRANSFORM_FLIP_H:
            return VK_SURFACE_TRANSFORM_HORIZONTAL_MIRROR_BIT_KHR;
        case NATIVE_WINDOW_TRANSFORM_ROT_90:
            return VK_SURFACE_TRANSFORM_ROTATE_90_BIT_KHR;
        // ...
    }
}

Color space support -- The swapchain maps Vulkan color spaces to Android data spaces:

// frameworks/native/vulkan/libvulkan/swapchain.cpp, line 162
const static VkColorSpaceKHR
    colorSpaceSupportedByVkEXTSwapchainColorspace[] = {
    VK_COLOR_SPACE_DISPLAY_P3_NONLINEAR_EXT,
    VK_COLOR_SPACE_DISPLAY_P3_LINEAR_EXT,
    VK_COLOR_SPACE_DCI_P3_NONLINEAR_EXT,
    VK_COLOR_SPACE_BT709_LINEAR_EXT,
    VK_COLOR_SPACE_BT709_NONLINEAR_EXT,
    VK_COLOR_SPACE_BT2020_LINEAR_EXT,
    VK_COLOR_SPACE_HDR10_ST2084_EXT,
    // ...
};

Presentation timing -- The TimingInfo class (line 181) tracks per-frame timing data for VK_GOOGLE_display_timing:

// frameworks/native/vulkan/libvulkan/swapchain.cpp, line 181
class TimingInfo {
public:
    TimingInfo(const VkPresentTimeGOOGLE* qp, uint64_t nativeFrameId)
        : vals_{qp->presentID, qp->desiredPresentTime, 0, 0, 0},
          native_frame_id_(nativeFrameId) {}
    bool ready() const { /* check all timestamps resolved */ }
    void calculate(int64_t rdur) { /* compute actual timings */ }
};

graph LR
    A["VkSwapchainKHR"] --> B["ANativeWindow"]
    B --> C["BufferQueue"]
    C --> D["dequeueBuffer()"]
    D --> E["VkImage<br/>(backed by<br/>AHardwareBuffer)"]
    E --> F["App renders"]
    F --> G["queueBuffer()"]
    G --> H["SurfaceFlinger<br/>acquires buffer"]

    style A fill:#2196F3,color:#fff
    style C fill:#FF9800,color:#fff
    style H fill:#9C27B0,color:#fff

13.3.7 Vulkan Profiles

frameworks/native/vulkan/vkprofiles/ defines Android Baseline Profiles (ABP) that specify minimum Vulkan feature sets for Android API levels. These profiles are used by CTS and by applications to query guaranteed capabilities.

13.3.8 The Null Driver

For testing and development, frameworks/native/vulkan/nulldrv/ provides a null Vulkan driver implementation. null_driver.cpp and null_driver_gen.cpp implement the full Vulkan API surface but perform no actual GPU operations. This is invaluable for:

Running CTS tests on emulators without GPU support
Testing the loader/layer infrastructure in isolation
Verifying application Vulkan usage patterns

13.3.9 Code Generation

Much of the Vulkan loader is generated from the Vulkan specification XML. The files api_gen.cpp, driver_gen.cpp, and null_driver_gen.cpp are auto-generated, providing:

Dispatch tables for all Vulkan entry points
ProcHook tables for extension-dependent functions
Stub implementations for the null driver

13.3.10 The Dispatch Table Architecture

Vulkan uses a two-level dispatch table system:

graph TD
    A["vkCreateBuffer()"] --> B["API Dispatch<br/>(api_gen.cpp)"]
    B --> C{"Layer<br/>present?"}
    C -->|Yes| D["Layer intercept"]
    D --> E["Driver Dispatch<br/>(driver_gen.cpp)"]
    C -->|No| E
    E --> F["Vendor ICD"]

    style B fill:#2196F3,color:#fff
    style D fill:#FF9800,color:#fff
    style F fill:#4CAF50,color:#fff

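The two levels are the API-level tables (api_gen.cpp), which enter the layer chain, and the driver-level tables (driver_gen.cpp), which call into the vendor ICD. At each level, the instance dispatch table is looked up from a VkInstance (or VkPhysicalDevice) handle and contains function pointers for instance-level commands, while the device dispatch table is looked up from a VkDevice (or VkQueue or VkCommandBuffer) handle and contains device-level function pointers. vkCreateBuffer() takes a VkDevice, so it is routed through the device table.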
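The effect of an optional layer on dispatch can be sketched with plain function pointers. All names below are illustrative stand-ins (`DeviceDispatch`, `buildDispatch`, `icdCreateBuffer`, `layerCreateBuffer`); the real tables are generated and hold many entry points.

```cpp
#include <string>
#include <vector>

// Records the order in which the stand-in functions run.
static std::vector<std::string> gCallLog;

using CreateBufferFn = int (*)(int size);

// Bottom of the chain: stand-in for the vendor ICD entry point.
static int icdCreateBuffer(int size) {
    gCallLog.push_back("icd");
    return size;
}

// A layer keeps a pointer to the next function in the chain.
static CreateBufferFn gLayerNext = nullptr;
static int layerCreateBuffer(int size) {
    gCallLog.push_back("layer");
    return gLayerNext(size);
}

// Per-device table the API entry point dispatches through.
struct DeviceDispatch {
    CreateBufferFn createBuffer;
};

// Build the table: with a layer, the top entry is the layer's intercept.
DeviceDispatch buildDispatch(bool withLayer) {
    if (withLayer) {
        gLayerNext = &icdCreateBuffer;
        return DeviceDispatch{&layerCreateBuffer};
    }
    return DeviceDispatch{&icdCreateBuffer};
}
```

Because the chain is resolved once when the table is built, a call without layers costs a single indirect jump.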

13.3.11 Extension Hook Points

The loader intercepts certain Vulkan functions that require Android-specific behavior. These "proc hooks" are defined for extensions like:

| Extension | Hooked Functions | Android Behavior |
| --- | --- | --- |
| `VK_KHR_android_surface` | `vkCreateAndroidSurfaceKHR` | Wraps ANativeWindow |
| `VK_KHR_swapchain` | `vkCreateSwapchainKHR` | Maps to BufferQueue |
| `VK_GOOGLE_display_timing` | `vkGetPastPresentationTimingGOOGLE` | Queries frame stats |
| `VK_EXT_debug_report` | All debug callbacks | Routes to logcat |

13.3.12 Vulkan Instance Creation Flow

sequenceDiagram
    participant App as Application
    participant API as api.cpp
    participant Driver as driver.cpp
    participant HAL as Vendor HAL

    App->>API: vkCreateInstance()
    API->>API: OverrideLayerNames::Parse()
    Note over API: Inject implicit layers<br/>from debug.vulkan.layers

    API->>API: OverrideExtensionNames::Parse()
    Note over API: Add VK_EXT_debug_report<br/>if debug layer present

    API->>Driver: CreateInfoWrapper::Validate()
    Note over Driver: Sanitize API version<br/>Filter extensions<br/>Clean pNext chain

    Driver->>HAL: Hal::Get().Device()<br/>.EnumerateInstanceExtensionProperties()
    HAL-->>Driver: Available extensions

    Driver->>HAL: vkCreateInstance()
    HAL-->>Driver: VkInstance

    Driver->>Driver: Store instance dispatch table
    Driver-->>API: VkInstance
    API-->>App: VkInstance

13.3.13 Physical Device Enumeration

HWUI's VulkanManager enumerates physical devices through the loader when it sets up its device:

// VulkanManager.cpp (in setupDevice, continued from line 197)
uint32_t gpuCount;
mEnumeratePhysicalDevices(mInstance, &gpuCount, nullptr);
// Just returning the first physical device

Android devices typically expose a single physical device (the mobile GPU). Multi-GPU configurations are uncommon on mobile hardware, so HWUI simply takes the first device reported.

13.3.14 Queue Family Selection

VulkanManager selects queue families that support graphics operations. The queue selection also considers the VK_EXT_global_priority extension for requesting elevated GPU scheduling priority:

// VulkanManager.cpp (sEnableExtensions)
VK_EXT_GLOBAL_PRIORITY_EXTENSION_NAME,
VK_EXT_GLOBAL_PRIORITY_QUERY_EXTENSION_NAME,
VK_KHR_GLOBAL_PRIORITY_EXTENSION_NAME,

This allows HWUI's rendering queue to have higher priority than background compute workloads.

13.4 ANGLE

13.4.1 GL-on-Vulkan Translation

ANGLE (Almost Native Graphics Layer Engine) is Google's implementation of OpenGL ES on top of other graphics APIs; on Android it targets Vulkan. In AOSP, it lives at external/angle/ and serves as an alternative GLES driver that translates OpenGL ES calls into Vulkan commands.

graph TD
    A["App GLES Calls"] --> B["libEGL_angle.so"]
    B --> C["ANGLE EGL<br/>Implementation"]
    C --> D["ANGLE GLES<br/>→ Vulkan Translator"]
    D --> E["Vulkan Commands"]
    E --> F["Vendor Vulkan<br/>Driver"]
    F --> G["GPU"]

    style B fill:#4CAF50,color:#fff
    style D fill:#FF9800,color:#fff
    style F fill:#2196F3,color:#fff

13.4.2 When ANGLE Is Used

ANGLE is selected through the EGL loader integration. The egl_platform_entries.cpp file includes EGL/eglext_angle.h (line 44), indicating ANGLE-specific extension support. The selection happens based on:

Per-app opt-in via the ANGLE preference UI in developer settings
System-wide ANGLE enablement via ro.hardware.egl property
Game driver selection through GraphicsEnv

13.4.3 Benefits of ANGLE

Driver consistency: Same GLES behavior across different GPU vendors
Bug isolation: GLES bugs can be fixed in ANGLE without vendor driver updates
Feature emulation: ANGLE can emulate GLES extensions using Vulkan features
Updatability: ANGLE can be updated via Google Play system updates

13.4.4 ANGLE Architecture

ANGLE translates both the command stream and the shaders:

GLES state tracking in the "front-end"
Vulkan command buffer recording in the "back-end"
ANGLE's shader translator for GLSL ES-to-SPIR-V compilation
Efficient resource management (texture, buffer, render pass)

13.5 Skia

13.5.1 Skia's Role in Android

Skia (external/skia/) is the 2D graphics library that powers nearly all rendering in Android. It provides:

Path rendering (curves, fills, strokes)
Text layout and rasterization
Image decoding and sampling
GPU-accelerated rendering via its "Ganesh" backend
Color management (wide gamut, HDR)

graph TD
    subgraph "Skia Architecture"
        A["SkCanvas<br/>(API Surface)"]
        B["SkPaint / SkPath<br/>(Primitives)"]
        C["SkSL<br/>(Shader Language)"]

        subgraph "GPU Backends"
            D["Ganesh<br/>(Production)"]
            E["Graphite<br/>(Next-gen)"]
        end

        subgraph "Ganesh Sub-backends"
            F["GL Backend"]
            G["Vulkan Backend"]
            H["Metal Backend"]
        end

        A --> D
        A --> E
        D --> F
        D --> G
        D --> H
        B --> A
        C --> D
    end

    style D fill:#FF9800,color:#fff
    style E fill:#9C27B0,color:#fff

13.5.2 Core API (`include/core/`)

Skia's public API is defined in external/skia/include/core/. Key classes:

SkCanvas: The drawing surface. All draw commands go through this.
SkPaint: Describes how to draw (color, style, blend mode, shader, etc.)
SkPath: Geometric path data (moves, lines, curves, arcs)
SkImage: Immutable image data (can be GPU-backed)
SkSurface: A writable drawing target (wraps a canvas)
SkShader: Per-pixel color generation (gradients, images, custom)
SkColorSpace: ICC profile-based color management
SkMatrix / SkM44: 2D and 3D transformation matrices

13.5.3 Ganesh GPU Backend (`src/gpu/ganesh/`)

Ganesh is Skia's current production GPU backend. It translates SkCanvas draw calls into GPU commands using either OpenGL or Vulkan. Key concepts:

GrDirectContext: The GPU context that owns all GPU resources.

// Used by RenderThread to create the Skia GPU context
// frameworks/base/libs/hwui/renderthread/RenderThread.cpp, line 232
sk_sp<GrDirectContext> grContext(
    GrDirectContexts::MakeGL(std::move(glInterface), options));

GrContextOptions: Configuration for the GPU context, set by HWUI in RenderThread.cpp (line 255):

// frameworks/base/libs/hwui/renderthread/RenderThread.cpp, line 255
void RenderThread::initGrContextOptions(GrContextOptions& options) {
    options.fPreferExternalImagesOverES3 = true;
    options.fDisableDistanceFieldPaths = true;
    if (android::base::GetBoolProperty(
            PROPERTY_REDUCE_OPS_TASK_SPLITTING, true)) {
        options.fReduceOpsTaskSplitting = GrContextOptions::Enable::kYes;
    }
}

Render passes (OpsTask): Ganesh batches draw calls into render passes and reorders them to minimize state changes and render target switches. The fReduceOpsTaskSplitting option controls how aggressively Ganesh merges render passes.

13.5.4 Graphite: The Next-Generation Backend¶

Graphite (src/gpu/graphite/) is Skia's next-generation GPU backend, designed to take better advantage of modern explicit APIs (Vulkan, Metal, D3D12). Key differences from Ganesh:

Aspect	Ganesh	Graphite
Recording	Immediate	Deferred
Thread model	Single-threaded GPU work	Multi-threaded recording
Command buffers	Implicit	Explicit
Pipeline state	Lazy	Pre-compiled
Resource management	GC-based	Explicit ownership

Graphite is not yet the default for Android HWUI but is under active development.

13.5.5 SkSL: Skia's Shading Language¶

SkSL is Skia's custom shading language that compiles to GLSL, SPIR-V, or MSL depending on the backend. It powers:

Runtime shader effects (SkRuntimeEffect)
Custom blend modes
Color filters and image filters

Compilation is handled by SkSL::Compiler, which translates SkSL into the target GPU shading language.

13.5.6 Codecs and Image Decoding¶

Skia includes codecs for PNG, JPEG, WebP, GIF, BMP, ICO, and WBMP. These are used by BitmapFactory (via HWUI's JNI layer) to decode images. The codec system is in src/codec/ and integrates with Android's ImageDecoder API.

13.5.7 Text Rendering¶

Skia handles glyph rasterization using:

FreeType: Outline and bitmap glyph rendering
HarfBuzz: Complex text shaping (handled by minikin on Android)
GPU glyph atlas: Ganesh maintains a texture atlas for cached glyphs, with the atlas size configured by HWUI's CacheManager (see Section 13.7.7)

13.5.8 SIMD Optimizations¶

Skia uses SIMD instructions extensively for CPU-side operations:

NEON (ARM): Used for blending, color conversion, image sampling
SSE/AVX (x86): Used for the same operations on x86 devices
Code paths are selected at compile time based on target architecture
Located primarily in src/opts/

13.5.9 Skia's Recording and Playback Model¶

Skia supports both immediate-mode rendering (draw directly to GPU) and recording mode (record to SkPicture for later playback). HWUI uses the recording model:

graph TD
    A["SkPictureRecorder"] --> B["beginRecording()"]
    B --> C["SkCanvas*<br/>(recording canvas)"]
    C --> D["draw commands<br/>(drawRect, drawPath, ...)"]
    D --> E["finishRecordingAsPicture()"]
    E --> F["sk_sp&lt;SkPicture&gt;"]

    G["Playback"] --> H["canvas->drawPicture(picture)"]
    H --> I["Replays all recorded<br/>commands on target canvas"]

    style A fill:#4CAF50,color:#fff
    style F fill:#2196F3,color:#fff

The recording approach enables:

Deferred rendering (record on UI thread, render on RenderThread)
Display list caching (re-render without re-recording)
Serialization (save/load for debugging with SKP files)
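
The record-then-replay idea can be illustrated without Skia. The sketch below is a toy model, not SkPictureRecorder: ToyCanvas, ToyPicture, and ToyRecorder are hypothetical names, and commands are stored as closures rather than in Skia's serialized op format.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical draw target: appends a textual trace of each draw call.
struct ToyCanvas {
    std::vector<std::string> log;
    void drawRect(int w, int h) {
        assert(w >= 0 && h >= 0);
        log.push_back("rect " + std::to_string(w) + "x" + std::to_string(h));
    }
    void drawCircle(int r) { log.push_back("circle r=" + std::to_string(r)); }
};

// Immutable list of recorded commands (plays the role of an SkPicture).
class ToyPicture {
public:
    using Command = std::function<void(ToyCanvas&)>;
    explicit ToyPicture(std::vector<Command> cmds) : mCommands(std::move(cmds)) {}
    // Replays every recorded command, in order, on the target canvas.
    void playback(ToyCanvas& target) const {
        for (const Command& cmd : mCommands) cmd(target);
    }
    std::size_t size() const { return mCommands.size(); }
private:
    std::vector<Command> mCommands;
};

// Captures draw calls instead of executing them.
class ToyRecorder {
public:
    void drawRect(int w, int h) {
        mPending.push_back([w, h](ToyCanvas& c) { c.drawRect(w, h); });
    }
    void drawCircle(int r) {
        mPending.push_back([r](ToyCanvas& c) { c.drawCircle(r); });
    }
    // Hands the recorded commands over and leaves the recorder empty.
    std::shared_ptr<const ToyPicture> finish() {
        auto picture = std::make_shared<ToyPicture>(std::move(mPending));
        mPending.clear();
        return picture;
    }
private:
    std::vector<ToyPicture::Command> mPending;
};
```

Recording once and calling playback() on two different ToyCanvas instances produces identical traces, which is the property that makes display list caching possible.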

13.5.10 GPU Resource Management in Ganesh¶

Ganesh manages GPU resources through a resource cache:

graph TD
    A["SkImage (CPU data)"] -->|"makeTextureImage()"| B["GrTexture<br/>(GPU texture)"]
    B --> C["GrResourceCache"]
    C --> D{"Referenced?"}
    D -->|Yes| E["Keep alive"]
    D -->|No| F{"Budget<br/>exceeded?"}
    F -->|Yes| G["Purge (LRU)"]
    F -->|No| H["Keep cached"]

    style C fill:#FF9800,color:#fff

The resource cache budget is set by HWUI's CacheManager:

// CacheManager.cpp, line 87
mGrContext->setResourceCacheLimit(mMaxResourceBytes);

Resources are classified as:

Scratch resources: Can be reused for any purpose (render targets, vertex buffers)
Unique resources: Tied to specific content (textures, shader programs)
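
The decision flow in the diagram above (keep referenced resources, purge unreferenced ones in LRU order once the budget is exceeded) can be sketched as a small budgeted cache. This is a simplified model under stated assumptions, not GrResourceCache: ToyResourceCache and its methods are hypothetical, and it ignores the scratch/unique distinction.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

// Toy budgeted cache: purges least-recently-used entries that nobody references.
class ToyResourceCache {
public:
    explicit ToyResourceCache(std::size_t budgetBytes) : mBudget(budgetBytes) {}

    // Adds a resource as most-recently-used and referenced (refCount = 1).
    void insert(const std::string& key, std::size_t bytes) {
        mEntries.push_front(Entry{key, bytes, 1});
        mUsed += bytes;
        purgeAsNeeded();
    }

    // Drops one reference; an unreferenced entry becomes purgeable.
    void unref(const std::string& key) {
        for (Entry& e : mEntries) {
            if (e.key == key) {
                assert(e.refCount > 0);
                --e.refCount;
                break;
            }
        }
        purgeAsNeeded();
    }

    bool contains(const std::string& key) const {
        for (const Entry& e : mEntries) {
            if (e.key == key) return true;
        }
        return false;
    }
    std::size_t usedBytes() const { return mUsed; }

private:
    struct Entry {
        std::string key;
        std::size_t bytes;
        int refCount;
    };

    // Walks from the LRU end, evicting unreferenced entries while over budget.
    void purgeAsNeeded() {
        auto it = mEntries.end();
        while (mUsed > mBudget && it != mEntries.begin()) {
            --it;
            if (it->refCount == 0) {
                mUsed -= it->bytes;
                it = mEntries.erase(it);
            }
        }
    }

    std::list<Entry> mEntries;  // front = most recently used
    std::size_t mBudget;
    std::size_t mUsed = 0;
};
```

Note that referenced entries are never evicted, so the cache can temporarily exceed its budget; it shrinks back as soon as references are released.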

13.5.11 Skia's Path Rendering¶

Path rendering is one of Skia's most complex subsystems. For GPU rendering, paths are tessellated into triangles:

graph LR
    A["SkPath<br/>(moveTo, lineTo,<br/>cubicTo, close)"] --> B["Tessellator"]
    B --> C["Triangle mesh"]
    C --> D["Vertex buffer"]
    D --> E["GPU draw call"]

    style A fill:#4CAF50,color:#fff
    style E fill:#2196F3,color:#fff

Ganesh uses several strategies depending on path complexity:

Simple convex paths: Direct tessellation
Complex paths: Stencil-then-cover algorithm
Small paths: Rasterized to a mask texture
Distance field paths: SDF-based rendering for resolution-independent paths

HWUI disables distance field paths:

// RenderThread.cpp, line 257
options.fDisableDistanceFieldPaths = true;
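
For the simple convex case listed above, tessellation can be as plain as a triangle fan. The sketch below is a generic fan triangulation written for illustration, not Ganesh's tessellator; Point, Triangle, fanTessellate, and signedArea are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { float x, y; };
struct Triangle { Point a, b, c; };

// Splits a convex polygon (vertices in order) into a fan anchored at vertex 0.
// An n-gon yields n - 2 triangles; fewer than 3 vertices yields none.
std::vector<Triangle> fanTessellate(const std::vector<Point>& polygon) {
    std::vector<Triangle> mesh;
    if (polygon.size() < 3) return mesh;
    mesh.reserve(polygon.size() - 2);
    for (std::size_t i = 1; i + 1 < polygon.size(); ++i) {
        mesh.push_back(Triangle{polygon[0], polygon[i], polygon[i + 1]});
    }
    assert(mesh.size() == polygon.size() - 2);
    return mesh;
}

// Signed area of one triangle (positive for counter-clockwise winding
// in a y-up coordinate system).
float signedArea(const Triangle& t) {
    return 0.5f * ((t.b.x - t.a.x) * (t.c.y - t.a.y) -
                   (t.c.x - t.a.x) * (t.b.y - t.a.y));
}
```

A fan only works for convex shapes, which is one reason concave or self-intersecting paths need a different strategy such as stencil-then-cover.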

13.5.12 SkSurface and Rendering Targets¶

SkSurface represents a drawing destination. In HWUI, surfaces wrap GPU rendering targets:

For SkiaGL: The surface wraps the EGL default framebuffer (FBO 0):

// SkiaOpenGLPipeline.cpp
surface = SkSurfaces::WrapBackendRenderTarget(
    mRenderThread.getGrContext(), backendRT,
    getSurfaceOrigin(), colorType,
    mSurfaceColorSpace, &props);

For SkiaVulkan: The surface wraps a Vulkan swapchain image:

// SkiaVulkanPipeline.cpp
backBuffer = mVkSurface->getCurrentSkSurface();

For offscreen layers: Surfaces are created as GPU render targets:

// SkiaGpuPipeline.cpp
node->setLayerSurface(SkSurfaces::RenderTarget(
    mRenderThread.getGrContext(),
    skgpu::Budgeted::kYes, info, 0,
    this->getSurfaceOrigin(), &props));

13.5.13 Text Atlas Management¶

Skia maintains GPU texture atlases for cached glyph images. The atlas configuration in HWUI:

// CacheManager.cpp
contextOptions->fGlyphCacheTextureMaximumBytes =
    mMaxGpuFontAtlasBytes;

The atlas size is derived from the screen area (shown here for an initialMaxSurfaceAreaScale of 1; Section 13.7.7 shows the full computation):

mMaxGpuFontAtlasBytes = nextPowerOfTwo(screenWidth * screenHeight)

For a 1080x2400 display: nextPowerOfTwo(2592000) = 4194304 (4 MB per atlas)
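
The arithmetic can be checked with a few lines of code. The helper below is an illustrative re-implementation of a round-up-to-power-of-two function, not HWUI's own; fontAtlasBytes is a hypothetical wrapper around the formula above.

```cpp
#include <cassert>
#include <cstdint>

// Rounds n up to the nearest power of two (returns n if it already is one).
// Illustrative helper; HWUI ships its own implementation.
std::uint32_t nextPowerOfTwo(std::uint32_t n) {
    assert(n <= (1u << 31));  // larger inputs have no 32-bit result
    std::uint32_t p = 1;
    while (p < n) p <<= 1;
    return p;
}

// Atlas byte budget for a given screen size, following the formula above.
std::uint32_t fontAtlasBytes(std::uint32_t width, std::uint32_t height) {
    return nextPowerOfTwo(width * height);
}
```

For 1080x2400 the product is 2,592,000, which lies between 2^21 (2,097,152) and 2^22 (4,194,304), so the budget rounds up to 4,194,304 bytes.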

Multiple atlases may be allocated:

A8 atlas for grayscale glyphs
ARGB atlas for color emoji
Distance field atlas for small text (if enabled)

13.6 HWUI¶

13.6.1 HWUI's Purpose¶

HWUI (Hardware UI) is the native rendering library that bridges Android's Java View system with the GPU. It lives in frameworks/base/libs/hwui/ and contains 488 files spanning canvas recording, display list management, render node properties, animation, and GPU pipeline integration.

graph TD
    subgraph "HWUI Architecture"
        A["Java View System"]
        B["Canvas.h<br/>(Recording API)"]
        C["RecordingCanvas<br/>(SkiaRecordingCanvas)"]
        D["SkiaDisplayList"]
        E["RenderNode"]
        F["RenderProperties"]
        G["RenderThread"]
        H["SkiaPipeline<br/>(GL or Vulkan)"]
        I["Skia (Ganesh)"]
    end

    A --> B
    B --> C
    C --> D
    D --> E
    E --> F
    E --> G
    G --> H
    H --> I

    style A fill:#4CAF50,color:#fff
    style G fill:#2196F3,color:#fff
    style I fill:#FF9800,color:#fff

13.6.2 The `Canvas` Interface¶

The abstract Canvas class in hwui/Canvas.h defines the full drawing API that Java android.graphics.Canvas maps to. It includes:

Recording API (used by the View system):

// frameworks/base/libs/hwui/hwui/Canvas.h, line 94
static WARN_UNUSED_RESULT Canvas* create_recording_canvas(
    int width, int height,
    uirenderer::RenderNode* renderNode = nullptr);

// frameworks/base/libs/hwui/hwui/Canvas.h, line 127
virtual void resetRecording(int width, int height,
    uirenderer::RenderNode* renderNode = nullptr) = 0;
virtual void finishRecording(
    uirenderer::RenderNode* destination) = 0;

Drawing primitives -- over 40 virtual methods, including:

// frameworks/base/libs/hwui/hwui/Canvas.h (selection)
virtual void drawColor(int color, SkBlendMode mode) = 0;
virtual void drawRect(float l, float t, float r, float b,
                      const Paint& paint) = 0;
virtual void drawRoundRect(float l, float t, float r, float b,
                           float rx, float ry, const Paint& paint) = 0;
virtual void drawCircle(float x, float y, float radius,
                        const Paint& paint) = 0;
virtual void drawPath(const SkPath& path, const Paint& paint) = 0;
virtual void drawBitmap(Bitmap& bitmap, float left, float top,
                        const Paint* paint) = 0;
virtual void drawRenderNode(
    uirenderer::RenderNode* renderNode) = 0;

View system operations (not exposed in public API):

virtual void enableZ(bool enableZ) = 0;
virtual void drawLayer(
    uirenderer::DeferredLayerUpdater* layerHandle) = 0;
virtual void drawWebViewFunctor(int functor) { }
virtual void punchHole(const SkRRect& rect, float alpha) = 0;

13.6.3 Canvas Op Types¶

The canvas operations that can be recorded are enumerated in CanvasOpTypes.h:

// frameworks/base/libs/hwui/canvas/CanvasOpTypes.h, line 23
enum class CanvasOpType : int8_t {
    // State ops
    Save, SaveLayer, SaveBehind, Restore, BeginZ, EndZ,

    // Clip ops
    ClipRect, ClipPath,

    // Drawing ops
    DrawColor, DrawRect, DrawRegion, DrawRoundRect,
    DrawRoundRectProperty, DrawDoubleRoundRect,
    DrawCircleProperty, DrawRippleDrawable, DrawCircle,
    DrawOval, DrawArc, DrawPaint, DrawPoint, DrawPoints,
    DrawPath, DrawLine, DrawLines, DrawVertices,
    DrawImage, DrawImageRect, DrawImageLattice,
    DrawPicture, DrawLayer, DrawRenderNode,

    COUNT
};

13.6.4 RenderNode: The View Tree Mirror¶

RenderNode (RenderNode.h, 452 lines) is the native counterpart of a Java View. Each View in the UI hierarchy has a corresponding RenderNode that stores:

RenderProperties -- visual properties (position, transform, alpha, clip, etc.)
DisplayList -- recorded drawing commands
AnimatorManager -- active property animations

// frameworks/base/libs/hwui/RenderNode.h, line 77
class RenderNode : public VirtualLightRefBase {
public:
    enum DirtyPropertyMask {
        GENERIC       = 1 << 1,
        TRANSLATION_X = 1 << 2,
        TRANSLATION_Y = 1 << 3,
        TRANSLATION_Z = 1 << 4,
        SCALE_X       = 1 << 5,
        SCALE_Y       = 1 << 6,
        ROTATION      = 1 << 7,
        ROTATION_X    = 1 << 8,
        ROTATION_Y    = 1 << 9,
        X             = 1 << 10,
        Y             = 1 << 11,
        Z             = 1 << 12,
        ALPHA         = 1 << 13,
        DISPLAY_LIST  = 1 << 14,
    };
    // ...
};

The DirtyPropertyMask enum enables fine-grained dirty tracking. When a View property changes (e.g., setTranslationX()), only the corresponding bit is set, avoiding unnecessary work during the sync phase.
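
A minimal sketch of this kind of bitmask tracking follows. It reuses a few of the flag values shown above so that it is self-contained; DirtyTracker is a hypothetical class, not part of HWUI, and the real RenderNode stores its mask alongside other sync state.

```cpp
#include <cassert>
#include <cstdint>

// Subset of the flags shown above, reproduced so the sketch compiles alone.
enum DirtyPropertyMask : std::uint32_t {
    GENERIC       = 1 << 1,
    TRANSLATION_X = 1 << 2,
    ALPHA         = 1 << 13,
    DISPLAY_LIST  = 1 << 14,
};

// Hypothetical tracker that accumulates dirty bits until a sync consumes them.
class DirtyTracker {
public:
    // Records that one property changed; repeated marks are idempotent.
    void mark(DirtyPropertyMask bit) {
        assert(bit != 0);
        mMask |= bit;
    }
    bool isDirty(DirtyPropertyMask bit) const { return (mMask & bit) != 0; }
    bool anyDirty() const { return mMask != 0; }

    // Returns the accumulated bits and resets the tracker, as a sync pass would.
    std::uint32_t consume() {
        const std::uint32_t mask = mMask;
        mMask = 0;
        return mask;
    }
private:
    std::uint32_t mMask = 0;
};
```

Because each property owns one bit, checking whether any work is pending is a single integer comparison, and the sync pass can tell a display list change apart from a pure property change.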

13.6.5 Double-Buffered Properties¶

RenderNode uses a double-buffering scheme for thread safety. Properties are set by the UI thread on the "staging" copy, then synced to the "render" copy on the RenderThread:

// frameworks/base/libs/hwui/RenderNode.h, line 138
const RenderProperties& properties() const { return mProperties; }
RenderProperties& animatorProperties() { return mProperties; }
const RenderProperties& stagingProperties() { return mStagingProperties; }
RenderProperties& mutateStagingProperties() { return mStagingProperties; }

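This pattern allows the UI thread and RenderThread to work concurrently without locks on the property data: the staging copy is written only by the UI thread, the render copy is read by the RenderThread, and the copy from staging to render happens during the sync phase while the UI thread is blocked (see Section 13.8.3).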
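
The staging/render split can be sketched with two plain copies and an explicit push step. ToyProperties and ToyRenderNode are hypothetical, heavily reduced stand-ins; the real classes carry far more state, and the safety of the copy relies on the caller invoking it only while the writer is blocked.

```cpp
#include <cassert>

// Hypothetical, heavily reduced stand-in for RenderProperties.
struct ToyProperties {
    float alpha = 1.0f;
    float translationX = 0.0f;
};

// Holds two copies: one written by the UI thread, one read by the renderer.
class ToyRenderNode {
public:
    // UI-thread side: writes go to the staging copy only.
    void setStagingAlpha(float alpha) {
        assert(alpha >= 0.0f && alpha <= 1.0f);
        mStaging.alpha = alpha;
        mStagingDirty = true;
    }
    const ToyProperties& stagingProperties() const { return mStaging; }

    // Render side: read-only view of the last synced state.
    const ToyProperties& properties() const { return mRender; }

    // Intended to run during the sync phase, while the writer is blocked,
    // so the copy needs no lock. Returns true if anything was pushed.
    bool pushStagingChanges() {
        if (!mStagingDirty) return false;
        mRender = mStaging;
        mStagingDirty = false;
        return true;
    }
private:
    ToyProperties mStaging;
    ToyProperties mRender;
    bool mStagingDirty = false;
};
```

Until pushStagingChanges() runs, the renderer keeps seeing the previous frame's values, so a frame is always drawn from one consistent snapshot.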

13.6.6 RenderProperties: The Full Property Set¶

RenderProperties.h (627 lines) contains the complete set of visual properties for a RenderNode:

// frameworks/base/libs/hwui/RenderProperties.h, line 574
struct PrimitiveFields {
    int mLeft = 0, mTop = 0, mRight = 0, mBottom = 0;
    int mWidth = 0, mHeight = 0;
    int mClippingFlags = CLIP_TO_BOUNDS;
    SkColor mSpotShadowColor = SK_ColorBLACK;
    SkColor mAmbientShadowColor = SK_ColorBLACK;
    float mAlpha = 1;
    float mTranslationX = 0, mTranslationY = 0, mTranslationZ = 0;
    float mElevation = 0;
    float mRotation = 0, mRotationX = 0, mRotationY = 0;
    float mScaleX = 1, mScaleY = 1;
    float mPivotX = 0, mPivotY = 0;
    bool mHasOverlappingRendering = false;
    bool mPivotExplicitlySet = false;
    bool mMatrixOrPivotDirty = false;
    bool mProjectBackwards = false;
    bool mProjectionReceiver = false;
    bool mAllowForceDark = true;
    bool mClipMayBeComplex = false;
    Rect mClipBounds;
    Outline mOutline;
    RevealClip mRevealClip;
} mPrimitiveFields;

13.6.7 LayerProperties and Layer Promotion¶

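A RenderNode can be "promoted" to an offscreen layer for composition. This happens when the node has no explicit layer type (LayerType::None), passes fitsOnLayer(), and at least one of the following holds: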

The node has a non-opaque alpha with overlapping rendering
An SkImageFilter is applied (blur, color matrix, etc.)
A stretch effect is active
WebView functors require a layer for clipping

// frameworks/base/libs/hwui/RenderProperties.h, line 552
bool promotedToLayer() const {
    return mLayerProperties.mType == LayerType::None &&
           fitsOnLayer() &&
           (mComputedFields.mNeedLayerForFunctors ||
            mLayerProperties.mImageFilter != nullptr ||
            mLayerProperties.getStretchEffect().requiresLayer() ||
            (!MathUtils::isZero(mPrimitiveFields.mAlpha) &&
             mPrimitiveFields.mAlpha < 1 &&
             mPrimitiveFields.mHasOverlappingRendering));
}

13.6.8 DisplayList: The Recorded Command Stream¶

DisplayList.h defines the container for recorded canvas operations. AOSP currently uses SkiaDisplayListWrapper as the active implementation:

// frameworks/base/libs/hwui/DisplayList.h, line 338
using DisplayList = SkiaDisplayListWrapper;

The SkiaDisplayListWrapper wraps a skiapipeline::SkiaDisplayList, which stores:

An SkPicture-like recording of Skia draw calls
References to child RenderNodes
References to AnimatedImageDrawables
WebView functor handles
Vector drawable references

There is also a MultiDisplayList variant (line 173) that supports both the Skia recording and a new CanvasOpBuffer format, indicating ongoing modernization of the display list system.

13.6.9 The Skia Display List Pipeline¶

graph TD
    A["View.draw(Canvas)"] --> B["SkiaRecordingCanvas"]
    B --> C["SkPictureRecorder"]
    C --> D["SkiaDisplayList"]
    D --> E["Child RenderNodes"]
    D --> F["SkDrawable references"]
    D --> G["WebView Functors"]

    H["RenderThread sync"] --> D
    H --> I["SkiaGpuPipeline.renderFrame()"]
    I --> J["RenderNodeDrawable.draw()"]
    J --> K["Replay SkPicture"]
    J --> L["Recurse into children"]

    style B fill:#4CAF50,color:#fff
    style I fill:#2196F3,color:#fff

13.7 RenderThread¶

13.7.1 The Dedicated Render Thread¶

The RenderThread is a singleton thread that handles all GPU rendering for an application. It is created once per process and manages the GPU context (GL or Vulkan), frame timing, and all rendering operations.

// frameworks/base/libs/hwui/renderthread/RenderThread.cpp, line 158
RenderThread& RenderThread::getInstance() {
    [[clang::no_destroy]] static sp<RenderThread> sInstance = []() {
        sp<RenderThread> thread = sp<RenderThread>::make();
        thread->start("RenderThread");
        return thread;
    }();
    gHasRenderThreadInstance = true;
    return *sInstance;
}

13.7.2 Initialization¶

When the RenderThread starts, it initializes several subsystems in initThreadLocals() (line 204):

// frameworks/base/libs/hwui/renderthread/RenderThread.cpp, line 204
void RenderThread::initThreadLocals() {
    setupFrameInterval();
    initializeChoreographer();
    mEglManager = new EglManager();
    mRenderState = new RenderState(*this);
    mVkManager = VulkanManager::getInstance();
    mCacheManager = new CacheManager(*this);
}

The thread runs at PRIORITY_DISPLAY (line 394) and integrates directly with the Choreographer for VSYNC timing.

13.7.3 The Thread Loop¶

The main loop in threadLoop() (line 393) follows a classic work-queue pattern:

// frameworks/base/libs/hwui/renderthread/RenderThread.cpp, line 393
bool RenderThread::threadLoop() {
    setpriority(PRIO_PROCESS, 0, PRIORITY_DISPLAY);
    Looper::setForThread(mLooper);
    if (gOnStartHook) {
        gOnStartHook("RenderThread");
    }
    initThreadLocals();

    while (true) {
        waitForWork();
        processQueue();
        // Handle VSYNC frame callbacks
        if (mPendingRegistrationFrameCallbacks.size() &&
            !mFrameCallbackTaskPending) {
            mVsyncSource->drainPendingEvents();
            mFrameCallbacks.insert(
                mPendingRegistrationFrameCallbacks.begin(),
                mPendingRegistrationFrameCallbacks.end());
            mPendingRegistrationFrameCallbacks.clear();
            requestVsync();
        }
        mCacheManager->onThreadIdle();
    }
    return false;
}
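
The wait-then-drain structure of this loop can be sketched with standard library primitives. ToyWorkQueue is a hypothetical class written for illustration; the real RenderThread uses a Looper and its own WorkQueue, and never exits its loop, whereas this sketch adds a quit() so that it can terminate.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <utility>

// Minimal work queue: producers post tasks, one worker loop drains them.
class ToyWorkQueue {
public:
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            assert(!mQuit && "post() after quit()");
            mTasks.push(std::move(task));
        }
        mCondition.notify_one();
    }

    // Asks the loop to exit once every already-posted task has run.
    void quit() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mQuit = true;
        }
        mCondition.notify_one();
    }

    // Worker loop: sleep until there is work, run it, repeat.
    void threadLoop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mMutex);
                mCondition.wait(lock, [this] { return mQuit || !mTasks.empty(); });
                if (mTasks.empty()) return;  // quit requested and queue drained
                task = std::move(mTasks.front());
                mTasks.pop();
            }
            task();  // run outside the lock so other threads can keep posting
        }
    }
private:
    std::mutex mMutex;
    std::condition_variable mCondition;
    std::queue<std::function<void()>> mTasks;
    bool mQuit = false;
};
```

In a real program threadLoop() would run on its own std::thread while other threads call post(); calling it on the current thread after quit() simply drains whatever was queued.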

13.7.4 VSYNC Integration¶

The RenderThread listens for VSYNC signals via AChoreographer:

// frameworks/base/libs/hwui/renderthread/RenderThread.cpp, line 106
class ChoreographerSource : public VsyncSource {
public:
    virtual void requestNextVsync() override {
        AChoreographer_postVsyncCallback(
            mRenderThread->mChoreographer,
            RenderThread::extendedFrameCallback,
            mRenderThread);
    }
};

The VSYNC callback delivers timing data for the frame: the vsync ID, the frame deadline, and the frame time. The frame interval is queried separately from the choreographer:

// frameworks/base/libs/hwui/renderthread/RenderThread.cpp, line 58
void RenderThread::extendedFrameCallback(
    const AChoreographerFrameCallbackData* cbData, void* data) {
    // ...
    AVsyncId vsyncId = AChoreographerFrameCallbackData_getFrameTimelineVsyncId(
        cbData, preferredFrameTimelineIndex);
    int64_t frameDeadline =
        AChoreographerFrameCallbackData_getFrameTimelineDeadlineNanos(
            cbData, preferredFrameTimelineIndex);
    int64_t frameTimeNanos =
        AChoreographerFrameCallbackData_getFrameTimeNanos(cbData);
    int64_t frameInterval =
        AChoreographer_getFrameInterval(rt->mChoreographer);
    rt->frameCallback(vsyncId, frameDeadline, frameTimeNanos,
                      frameInterval);
}

13.7.5 EglManager¶

EglManager.cpp (789 lines) manages the EGL context for the SkiaGL pipeline. Key operations:

Initialization (line 109):

// frameworks/base/libs/hwui/renderthread/EglManager.cpp, line 109
void EglManager::initialize() {
    if (hasEglContext()) return;
    ATRACE_NAME("Creating EGLContext");
    mEglDisplay = eglGetDisplay(EGL_DEFAULT_DISPLAY);
    EGLint major, minor;
    eglInitialize(mEglDisplay, &major, &minor);
    initExtensions();
    loadConfigs();
    createContext();
    createPBufferSurface();
    makeCurrent(mPBufferSurface, nullptr, true);
    // ...
}

Config selection -- The EglManager loads four configurations for different pixel formats:

Config	Pixel Format	Use Case
`mEglConfig`	RGBA8888	Default rendering
`mEglConfigF16`	RGBA_F16	Wide color gamut / HDR
`mEglConfig1010102`	RGB10_A2	10-bit color
`mEglConfigA8`	R8	Alpha-only (masks)

Color space handling -- createSurface() (line 396) maps Android ColorMode to EGL color space attributes:

// frameworks/base/libs/hwui/renderthread/EglManager.cpp, line 466
switch (colorMode) {
    case ColorMode::Default:
        attribs[1] = EGL_GL_COLORSPACE_LINEAR_KHR;
        break;
    case ColorMode::Hdr:
        attribs[1] = EGL_GL_COLORSPACE_SCRGB_EXT;
        break;
    case ColorMode::WideColorGamut:
        attribs[1] = EGL_GL_COLORSPACE_DISPLAY_P3_PASSTHROUGH_EXT;
        break;
}

Fence synchronization -- fenceWait() (line 689) implements GPU-side fence waits using EGL_KHR_wait_sync:

// frameworks/base/libs/hwui/renderthread/EglManager.cpp, line 689
status_t EglManager::fenceWait(int fence) {
    if (EglExtensions.waitSync && EglExtensions.nativeFenceSync) {
        int fenceFd = ::dup(fence);
        EGLint attribs[] = {
            EGL_SYNC_NATIVE_FENCE_FD_ANDROID, fenceFd, EGL_NONE
        };
        EGLSyncKHR sync = eglCreateSyncKHR(mEglDisplay,
            EGL_SYNC_NATIVE_FENCE_ANDROID, attribs);
        eglWaitSyncKHR(mEglDisplay, sync, 0);
        eglDestroySyncKHR(mEglDisplay, sync);
    } else {
        // Fall back to CPU-side wait
        sync_wait(fence, -1);
    }
    return OK;
}

13.7.6 VulkanManager¶

VulkanManager.cpp is the Vulkan counterpart to EglManager. It is a singleton shared across threads (the RenderThread and the HardwareBitmapUploader thread):

// frameworks/base/libs/hwui/renderthread/VulkanManager.cpp, line 85
sp<VulkanManager> VulkanManager::getInstance() {
    std::lock_guard _lock{sLock};
    sp<VulkanManager> vulkanManager = sWeakInstance.promote();
    if (!vulkanManager.get()) {
        vulkanManager = new VulkanManager();
        sWeakInstance = vulkanManager;
    }
    return vulkanManager;
}
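
The same "shared while in use, recreated on demand" pattern can be expressed with std::shared_ptr and std::weak_ptr in place of Android's sp<> and wp<>. ToyManager is a hypothetical class used only to demonstrate the lifetime behavior; the live-instance counter exists purely so the behavior can be observed.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical manager that is shared while in use and destroyed when the
// last owner releases it.
class ToyManager {
public:
    static std::shared_ptr<ToyManager> getInstance() {
        std::lock_guard<std::mutex> lock(sLock);
        // lock() yields an empty pointer once all owners are gone.
        std::shared_ptr<ToyManager> instance = sWeakInstance.lock();
        if (!instance) {
            instance = std::shared_ptr<ToyManager>(new ToyManager());
            sWeakInstance = instance;
        }
        assert(instance != nullptr);
        return instance;
    }
    static int liveCount() { return sLiveCount.load(); }
    ~ToyManager() { --sLiveCount; }
private:
    ToyManager() { ++sLiveCount; }
    static std::mutex sLock;
    static std::weak_ptr<ToyManager> sWeakInstance;
    static std::atomic<int> sLiveCount;
};

std::mutex ToyManager::sLock;
std::weak_ptr<ToyManager> ToyManager::sWeakInstance;
std::atomic<int> ToyManager::sLiveCount{0};
```

Unlike a function-local static singleton, this instance is torn down when no thread holds it, which frees its resources between uses at the cost of re-initialization on the next getInstance() call.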

The VulkanManager enables 26 Vulkan extensions (line 49):

// frameworks/base/libs/hwui/renderthread/VulkanManager.cpp, line 49
static std::array<std::string_view, 26> sEnableExtensions{
    VK_KHR_EXTERNAL_MEMORY_CAPABILITIES_EXTENSION_NAME,
    VK_KHR_EXTERNAL_MEMORY_EXTENSION_NAME,
    VK_KHR_SURFACE_EXTENSION_NAME,
    VK_KHR_SWAPCHAIN_EXTENSION_NAME,
    VK_KHR_IMAGE_FORMAT_LIST_EXTENSION_NAME,
    VK_EXT_IMAGE_DRM_FORMAT_MODIFIER_EXTENSION_NAME,
    VK_ANDROID_EXTERNAL_MEMORY_ANDROID_HARDWARE_BUFFER_EXTENSION_NAME,
    VK_EXT_QUEUE_FAMILY_FOREIGN_EXTENSION_NAME,
    VK_KHR_EXTERNAL_SEMAPHORE_FD_EXTENSION_NAME,
    VK_KHR_ANDROID_SURFACE_EXTENSION_NAME,
    VK_EXT_GLOBAL_PRIORITY_EXTENSION_NAME,
    VK_EXT_GLOBAL_PRIORITY_QUERY_EXTENSION_NAME,
    VK_KHR_GLOBAL_PRIORITY_EXTENSION_NAME,
    VK_EXT_DEVICE_FAULT_EXTENSION_NAME,
    VK_EXT_FRAME_BOUNDARY_EXTENSION_NAME,
    VK_ANDROID_FRAME_BOUNDARY_EXTENSION_NAME,
    // ... (listing abridged; the array holds 26 entries)
};

Device setup (line 125) follows the standard Vulkan initialization pattern: enumerate physical devices, select extensions, create a logical device:

// frameworks/base/libs/hwui/renderthread/VulkanManager.cpp, line 125
void VulkanManager::setupDevice() {
    constexpr VkApplicationInfo app_info = {
        VK_STRUCTURE_TYPE_APPLICATION_INFO,
        nullptr,
        "android framework",  // pApplicationName
        0,
        "android framework",  // pEngineName
        0,
        mAPIVersion,
    };
    // Enumerate instance extensions, create instance,
    // enumerate physical devices, create logical device...
}

13.7.7 CacheManager¶

CacheManager.cpp (364 lines) manages GPU memory budgets for the Skia GrDirectContext. It implements memory pressure responses at multiple levels:

// frameworks/base/libs/hwui/renderthread/CacheManager.cpp, line 122
void CacheManager::trimMemory(TrimLevel mode) {
    if (!mGrContext) return;
    mGrContext->flushAndSubmit(GrSyncCpu::kYes);

    if (mode >= TrimLevel::BACKGROUND) {
        mGrContext->freeGpuResources();
        SkGraphics::PurgeAllCaches();
        mRenderThread.destroyRenderingContext();
    } else if (mode == TrimLevel::UI_HIDDEN) {
        mGrContext->setResourceCacheLimit(mBackgroundResourceBytes);
        SkGraphics::SetFontCacheLimit(mBackgroundCpuFontCacheBytes);
        mGrContext->purgeUnlockedResources(
            toSkiaEnum(mMemoryPolicy.purgeScratchOnly));
        mGrContext->setResourceCacheLimit(mMaxResourceBytes);
        SkGraphics::SetFontCacheLimit(mMaxCpuFontCacheBytes);
    }
}

Cache sizing: The cache limits are derived from the screen resolution:

// frameworks/base/libs/hwui/renderthread/CacheManager.cpp, line 45
CacheManager::CacheManager(RenderThread& thread)
    : mRenderThread(thread), mMemoryPolicy(loadMemoryPolicy()) {
    mMaxSurfaceArea = static_cast<size_t>(
        (DeviceInfo::getWidth() * DeviceInfo::getHeight()) *
        mMemoryPolicy.initialMaxSurfaceAreaScale);
    setupCacheLimits();
}

// line 62
void CacheManager::setupCacheLimits() {
    mMaxResourceBytes = mMaxSurfaceArea *
        mMemoryPolicy.surfaceSizeMultiplier;
    mBackgroundResourceBytes = mMaxResourceBytes *
        mMemoryPolicy.backgroundRetentionPercent;
    mMaxGpuFontAtlasBytes = nextPowerOfTwo(mMaxSurfaceArea);
    mMaxCpuFontCacheBytes = std::max(
        mMaxGpuFontAtlasBytes * 4,
        SkGraphics::GetFontCacheLimit());
}
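
To get a feel for the magnitudes, the sizing formulas can be evaluated for a concrete display. The policy values below are assumptions chosen for illustration (the real ones come from loadMemoryPolicy() and may differ per device), and ToyMemoryPolicy, ToyCacheLimits, and computeLimits are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Assumed policy values, for illustration only.
struct ToyMemoryPolicy {
    double initialMaxSurfaceAreaScale = 1.0;
    double surfaceSizeMultiplier = 48.0;      // assumed: 12 surfaces x 4 bytes/pixel
    double backgroundRetentionPercent = 0.5;  // assumed: keep half when hidden
};

struct ToyCacheLimits {
    std::size_t maxSurfaceArea;
    std::size_t maxResourceBytes;
    std::size_t backgroundResourceBytes;
};

// Mirrors the structure of the constructor and setupCacheLimits() above.
ToyCacheLimits computeLimits(int width, int height, const ToyMemoryPolicy& policy) {
    assert(width > 0 && height > 0);
    ToyCacheLimits limits{};
    limits.maxSurfaceArea = static_cast<std::size_t>(
        (static_cast<double>(width) * height) * policy.initialMaxSurfaceAreaScale);
    limits.maxResourceBytes = static_cast<std::size_t>(
        limits.maxSurfaceArea * policy.surfaceSizeMultiplier);
    limits.backgroundResourceBytes = static_cast<std::size_t>(
        limits.maxResourceBytes * policy.backgroundRetentionPercent);
    return limits;
}
```

With these assumed values, a 1080x2400 display gives a surface area of 2,592,000 pixels, a GPU resource budget of 124,416,000 bytes (about 119 MiB), and a background budget of half that.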

Deferred cleanup: Each time the thread goes idle, the CacheManager checks whether more than 25 ms have passed since the last cleanup and, if so, performs an incremental resource cleanup:

// line 281
void CacheManager::onThreadIdle() {
    if (!mGrContext || mFrameCompletions.size() == 0) return;
    const nsecs_t now = systemTime(CLOCK_MONOTONIC);
    if ((now - mLastDeferredCleanup) > 25_ms) {
        mLastDeferredCleanup = now;
        // ...
        mGrContext->performDeferredCleanup(
            std::chrono::milliseconds(cleanupMillis),
            toSkiaEnum(mMemoryPolicy.purgeScratchOnly));
    }
}
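
The rate-limiting part of this function is a common idiom that can be isolated into a few lines. ToyThrottle is a hypothetical helper, not part of HWUI; it takes timestamps from the caller so that the behavior is deterministic.

```cpp
#include <cassert>
#include <cstdint>

// Lets an action run at most once per interval, given caller-supplied timestamps.
class ToyThrottle {
public:
    explicit ToyThrottle(std::int64_t intervalNanos) : mInterval(intervalNanos) {
        assert(intervalNanos > 0);
    }

    // Returns true (and records the time) when strictly more than one
    // interval has elapsed since the last accepted call.
    bool shouldRun(std::int64_t nowNanos) {
        if ((nowNanos - mLast) <= mInterval) return false;
        mLast = nowNanos;
        return true;
    }
private:
    std::int64_t mInterval;
    std::int64_t mLast = 0;
};
```

Throttling keeps a busy render loop, which may go idle many times per frame, from spending that idle time repeatedly scanning the resource cache.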

13.7.8 GPU Context Lifecycle¶

stateDiagram-v2
    [*] --> NoContext
    NoContext --> GLContext : requireGlContext
    NoContext --> VkContext : requireVkContext
    GLContext --> NoContext : destroyRenderingContext
    VkContext --> NoContext : destroyRenderingContext
    GLContext --> GLContext : frame rendering
    VkContext --> VkContext : frame rendering

    note right of GLContext
        EglManager.initialize()
        GrDirectContexts::MakeGL
    end note

    note right of VkContext
        VulkanManager.initialize()
        VulkanManager.createContext()
    end note

The RenderThread lazily creates the GPU context on first use:

// frameworks/base/libs/hwui/renderthread/RenderThread.cpp, line 218
void RenderThread::requireGlContext() {
    if (mEglManager->hasEglContext()) return;
    mEglManager->initialize();
    sk_sp<const GrGLInterface> glInterface = GrGLMakeNativeInterface();
    GrContextOptions options;
    initGrContextOptions(options);
    cacheManager().configureContext(&options, glesVersion, size);
    sk_sp<GrDirectContext> grContext(
        GrDirectContexts::MakeGL(std::move(glInterface), options));
    setGrContext(grContext);
}

void RenderThread::requireVkContext() {
    if (vulkanManager().hasVkContext() && mGrContext) return;
    mVkManager->initialize();
    GrContextOptions options;
    initGrContextOptions(options);
    cacheManager().configureContext(&options, &vkDriverVersion,
                                   sizeof(vkDriverVersion));
    sk_sp<GrDirectContext> grContext =
        mVkManager->createContext(options);
    setGrContext(grContext);
}

13.8 End-to-End Frame Pipeline¶

13.8.1 The Complete Frame Journey¶

This section traces a single frame from View.invalidate() to photons leaving the display, referencing exact source files and line numbers.

sequenceDiagram
    participant App as App (UI Thread)
    participant VRI as ViewRootImpl
    participant RC as RecordingCanvas
    participant RN as RenderNode
    participant RP as RenderProxy
    participant DFT as DrawFrameTask
    participant RT as RenderThread
    participant CC as CanvasContext
    participant SP as SkiaPipeline
    participant Skia as Skia (Ganesh)
    participant GPU as GPU
    participant BQ as BufferQueue
    participant SF as SurfaceFlinger
    participant HWC as HWC

    App->>VRI: View.invalidate()
    Note over VRI: Schedules traversal<br/>for next VSYNC

    VRI->>VRI: Choreographer callback
    VRI->>VRI: performTraversals()
    VRI->>VRI: performDraw()

    VRI->>RC: Canvas canvas = node.beginRecording()
    App->>RC: canvas.drawRect(), drawText(), ...
    RC->>RC: Record into SkPictureRecorder
    VRI->>RN: node.endRecording()
    Note over RN: Staging DisplayList set

    VRI->>RP: RenderProxy.syncAndDrawFrame()
    RP->>DFT: drawFrame()
    DFT->>RT: postAndWait() [queue task]
    Note over App: UI thread BLOCKS

    RT->>DFT: run()
    DFT->>CC: syncFrameState(info)
    CC->>RN: prepareTree(info)
    Note over RN: Sync staging → render<br/>properties & display lists

    DFT-->>App: unblockUiThread()
    Note over App: UI thread UNBLOCKED

    CC->>SP: draw(solelyTextureViewUpdates)
    SP->>SP: getFrame() [dequeue buffer]
    SP->>SP: renderFrame()
    SP->>Skia: SkCanvas operations
    Skia->>GPU: GL/VK draw commands
    SP->>SP: FlushAndSubmit()
    SP->>SP: swapBuffers()
    SP->>BQ: eglSwapBuffers / vkQueuePresent

    BQ->>SF: Buffer available signal
    SF->>SF: Composite all layers
    SF->>HWC: setLayerBuffer()
    HWC->>HWC: Hardware composition
    HWC-->>SF: presentDisplay()

13.8.2 Phase 1: Recording (UI Thread)¶

Step 1: Invalidation. When View.invalidate() is called, the framework marks the View and its ancestors dirty. ViewRootImpl schedules a traversal callback with Choreographer.

Step 2: Traversal. On the next VSYNC, ViewRootImpl.performTraversals() is called. This triggers measure, layout, and draw passes.

Step 3: Recording. During the draw pass:

// View.java (simplified)
void updateDisplayListIfDirty() {
    RecordingCanvas canvas = renderNode.beginRecording(width, height);
    try {
        draw(canvas);  // View.draw(Canvas) - app code runs here
    } finally {
        renderNode.endRecording();
    }
}

The Canvas::create_recording_canvas() factory (in Canvas.h, line 94) creates a SkiaRecordingCanvas that wraps SkPictureRecorder. Every canvas.drawRect(), canvas.drawText(), etc. call is recorded into the SkPicture rather than executed immediately.

13.8.3 Phase 2: Sync (RenderThread)¶

Step 4: Post and Wait. RenderProxy posts a DrawFrameTask to the RenderThread and blocks:

// frameworks/base/libs/hwui/renderthread/DrawFrameTask.cpp, line 82
void DrawFrameTask::postAndWait() {
    ATRACE_CALL();
    AutoMutex _lock(mLock);
    mRenderThread->queue().post([this]() { run(); });
    mSignal.wait(mLock);
}

Step 5: Frame State Sync. The RenderThread calls syncFrameState() (line 169):

// frameworks/base/libs/hwui/renderthread/DrawFrameTask.cpp, line 169
bool DrawFrameTask::syncFrameState(TreeInfo& info) {
    int64_t vsync = mFrameInfo[static_cast<int>(
        FrameInfoIndex::Vsync)];
    mRenderThread->timeLord().vsyncReceived(vsync, ...);
    bool canDraw = mContext->makeCurrent();
    mContext->unpinImages();

    // Apply deferred layer updates (TextureView, etc.)
    for (size_t i = 0; i < mLayers.size(); i++) {
        if (mLayers[i]) mLayers[i]->apply();
    }
    mLayers.clear();

    mContext->setContentDrawBounds(mContentDrawBounds);
    mContext->prepareTree(info, mFrameInfo, mSyncQueued, mTargetNode);
    // ...
}

prepareTree() walks the entire RenderNode tree, syncing staging properties and display lists to their render counterparts. After sync completes, the UI thread is unblocked:

// DrawFrameTask.cpp, line 125
if (canUnblockUiThread) {
    unblockUiThread();
}
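
The block-until-synced handshake can be sketched with a mutex and a condition variable. This is a toy model under stated assumptions: ToyFrameTask is hypothetical, it spawns a fresh thread per frame instead of posting to a long-lived RenderThread, and the "staging state" is a single string. Unlike the listing above, it waits with a predicate to guard against spurious wakeups.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Toy model of the sync handshake: the caller blocks until the worker has
// copied the staging state, then both proceed in parallel.
class ToyFrameTask {
public:
    ~ToyFrameTask() { join(); }

    // "UI thread" side: start the worker and block until sync is done.
    void postAndWait(const std::string& staging) {
        assert(!mWorker.joinable() && "previous frame not joined");
        std::unique_lock<std::mutex> lock(mLock);
        mSynced = false;
        mWorker = std::thread([this, &staging] { run(staging); });
        mSignal.wait(lock, [this] { return mSynced; });
    }

    void join() {
        if (mWorker.joinable()) mWorker.join();
    }
    const std::vector<std::string>& log() const { return mLog; }

private:
    // "RenderThread" side: sync first, unblock the caller, then draw.
    void run(const std::string& staging) {
        std::string renderCopy;
        {
            std::lock_guard<std::mutex> lock(mLock);
            renderCopy = staging;  // safe: the caller is blocked in wait()
            mLog.push_back("sync");
            mSynced = true;
        }
        mSignal.notify_one();  // corresponds to unblockUiThread()
        std::lock_guard<std::mutex> lock(mLock);
        mLog.push_back("draw " + renderCopy);  // uses only the private copy
    }

    std::mutex mLock;
    std::condition_variable mSignal;
    bool mSynced = false;
    std::thread mWorker;
    std::vector<std::string> mLog;
};
```

Once postAndWait() returns, the caller is free to modify its staging state for the next frame; the worker draws from its own copy, so the two never touch the same data at the same time.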

13.8.4 Phase 3: Rendering (RenderThread)¶

Step 6: Draw. CanvasContext::draw() orchestrates the actual rendering:

// CanvasContext.cpp (simplified)
void CanvasContext::draw(bool solelyTextureViewUpdates) {
    Frame frame = mRenderPipeline->getFrame();
    SkRect dirty = computeDirtyRect(frame, ...);
    auto drawResult = mRenderPipeline->draw(
        frame, screenDirty, dirty, lightGeometry,
        &mLayerUpdateQueue, mContentDrawBounds,
        mOpaque, lightInfo, mRenderNodes, ...);
    bool requireSwap;
    mRenderPipeline->swapBuffers(frame, drawResult,
        screenDirty, currentFrameInfo, &requireSwap);
}

For the SkiaGL pipeline (SkiaOpenGLPipeline.cpp, line 116):

// frameworks/base/libs/hwui/pipeline/skia/SkiaOpenGLPipeline.cpp, line 116
IRenderPipeline::DrawResult SkiaOpenGLPipeline::draw(...) {
    mEglManager.damageFrame(frame, dirty);

    // Create an SkSurface wrapping the EGL default framebuffer
    GrGLFramebufferInfo fboInfo;
    fboInfo.fFBOID = 0;
    fboInfo.fFormat = GL_RGBA8;  // or GL_RGBA16F for HDR

    auto backendRT = GrBackendRenderTargets::MakeGL(
        frame.width(), frame.height(), 0, STENCIL_BUFFER_SIZE, fboInfo);
    sk_sp<SkSurface> surface = SkSurfaces::WrapBackendRenderTarget(
        mRenderThread.getGrContext(), backendRT,
        getSurfaceOrigin(), colorType, mSurfaceColorSpace, &props);

    LightingInfo::updateLighting(localGeometry, lightInfo);
    renderFrame(*layerUpdateQueue, dirty, renderNodes,
        opaque, contentDrawBounds, surface, preTransform);

    skgpu::ganesh::FlushAndSubmit(surface);
    return {true, ...};
}

For the SkiaVulkan pipeline (SkiaVulkanPipeline.cpp, line 74):

// frameworks/base/libs/hwui/pipeline/skia/SkiaVulkanPipeline.cpp, line 74
IRenderPipeline::DrawResult SkiaVulkanPipeline::draw(...) {
    sk_sp<SkSurface> backBuffer =
        mVkSurface->getCurrentSkSurface();
    SkMatrix preTransform =
        mVkSurface->getCurrentPreTransform();

    renderFrame(*layerUpdateQueue, dirty, renderNodes,
        opaque, contentDrawBounds, backBuffer, preTransform);

    auto drawResult = vulkanManager().finishFrame(
        backBuffer.get());
    return {true, drawResult.submissionTime,
            std::move(drawResult.presentFence)};
}

13.8.5 Phase 4: Presentation¶

Step 7: Swap Buffers. The completed frame is submitted to the BufferQueue:

For GL:

// EglManager.cpp, line 621
bool EglManager::swapBuffers(const Frame& frame,
                              const SkRect& screenDirty) {
    EGLint rects[4];
    frame.map(screenDirty, rects);
    eglSwapBuffersWithDamageKHR(mEglDisplay, frame.mSurface,
        rects, screenDirty.isEmpty() ? 0 : 1);
    // ...
}

For Vulkan:

// SkiaVulkanPipeline.cpp, line 130
bool SkiaVulkanPipeline::swapBuffers(...) {
    currentFrameInfo->markSwapBuffers();
    if (*requireSwap) {
        vulkanManager().swapBuffers(mVkSurface, screenDirty,
            std::move(drawResult.presentFence));
    }
    return *requireSwap;
}

Step 8: SurfaceFlinger Composition. SurfaceFlinger acquires the buffer from the BufferQueue, composites all visible layers (using RenderEngine for GPU composition or HWC for hardware overlay composition), and presents the result to the display.

13.8.6 Timing Budget¶

For a 60 FPS display (16.67ms frame budget):

gantt
    title Frame Timing Budget (16.67ms @ 60 FPS)
    dateFormat X
    axisFormat %L

    section UI Thread
    VSYNC arrival           :v1, 0, 0
    Input handling          :a1, 0, 2
    Animation callbacks     :a2, 2, 4
    Measure + Layout        :a3, 4, 6
    Draw (Record)           :a4, 6, 9
    Sync wait               :a5, 9, 10

    section RenderThread
    Sync frame state        :b1, 9, 10
    GPU draw commands       :b2, 10, 14
    Swap buffers            :b3, 14, 15

    section SurfaceFlinger
    Composite               :c1, 15, 16
    Present to HWC          :c2, 16, 17
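
The 16.67 ms figure is simply the reciprocal of the refresh rate, so the same arithmetic gives the budget for 90 Hz or 120 Hz panels. The following is a minimal sketch of that calculation; frameBudgetMs and fitsInBudget are hypothetical helper names for illustration, not HWUI or SurfaceFlinger APIs.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: frame budget in milliseconds for a given refresh rate.
double frameBudgetMs(double refreshRateHz) {
    return 1000.0 / refreshRateHz;
}

// Hypothetical helper: true if the UI thread and RenderThread work together
// fit inside one frame interval at the given refresh rate.
bool fitsInBudget(double uiThreadMs, double renderThreadMs, double refreshRateHz) {
    return (uiThreadMs + renderThreadMs) <= frameBudgetMs(refreshRateHz);
}
```

With the durations from the chart (roughly 9 ms of UI thread work and 5 ms on the RenderThread), fitsInBudget(9.0, 5.0, 60.0) holds, while the same workload would miss an 8.33 ms budget at 120 Hz.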

13.9 SurfaceFlinger RenderEngine¶

13.9.1 What RenderEngine Does¶

SurfaceFlinger's RenderEngine performs GPU-based layer composition when the Hardware Composer (HWC) cannot handle all layers through hardware overlays. Common scenarios:

Layers with complex blend modes
Layers requiring color space conversion
More layers than HWC overlay planes support
Rounded corners or other visual effects

13.9.2 Skia-Based RenderEngine¶

Modern AOSP uses a Skia-based RenderEngine, replacing the legacy OpenGL-based implementation. This lives in frameworks/native/libs/renderengine/skia/.

graph TD
    A["SurfaceFlinger"] --> B["RenderEngine"]
    B --> C["SkiaRenderEngine"]
    C --> D["Skia (Ganesh)"]
    D --> E{"Backend"}
    E -->|GL| F["GL RenderEngine"]
    E -->|Vulkan| G["Vulkan RenderEngine"]
    F --> H["GPU"]
    G --> H

    style B fill:#9C27B0,color:#fff
    style C fill:#FF9800,color:#fff

13.9.3 RenderEngine Operations¶

RenderEngine handles:

Layer composition: Drawing each layer's buffer onto the output buffer
Color management: Converting between different layer color spaces
HDR tone-mapping: Mapping HDR content for SDR displays
Shadow rendering: Drawing window shadows below elevation
Blur effects: Background blur for notification shade, dialogs
Dim layers: System-level dimming overlays
Screenshot capture: Compositing visible layers for screenshots

13.9.4 Composition Flow¶

sequenceDiagram
    participant SF as SurfaceFlinger
    participant HWC as HWC HAL
    participant RE as RenderEngine

    SF->>HWC: validate(layers)
    HWC-->>SF: composition types<br/>(DEVICE, CLIENT, CURSOR)
    Note over SF: Some layers marked CLIENT

    SF->>RE: drawLayers(clientLayers)
    RE->>RE: For each CLIENT layer:
    RE->>RE: 1. Bind layer buffer as texture
    RE->>RE: 2. Apply color transform
    RE->>RE: 3. Draw to output buffer
    RE-->>SF: Composited output buffer

    SF->>HWC: setClientTarget(outputBuffer)
    SF->>HWC: presentDisplay()

13.9.5 HWC Layer Composition Types¶

The Hardware Composer classifies each layer into a composition type:

graph TD
    A["All Visible Layers"] --> B["HWC validate()"]
    B --> C{"HWC Decision"}
    C -->|DEVICE| D["Hardware Overlay<br/>(Direct scanout)"]
    C -->|CLIENT| E["GPU Composition<br/>(RenderEngine)"]
    C -->|CURSOR| F["Hardware Cursor<br/>(Dedicated plane)"]
    C -->|SIDEBAND| G["Sideband Stream<br/>(Video tunnel)"]

    D --> H["Display Controller"]
    E --> I["Client Target Buffer"]
    I --> H
    F --> H
    G --> H

    style D fill:#4CAF50,color:#fff
    style E fill:#FF9800,color:#fff
    style F fill:#2196F3,color:#fff

DEVICE composition is preferred because it avoids GPU work entirely. The display controller directly reads from the layer's buffer. This is used for:

Simple rectangular layers without complex blend modes
Video playback surfaces
Status bar and navigation bar

CLIENT composition falls back to GPU rendering when hardware capabilities are exceeded. Common triggers:

More layers than available hardware planes
Complex blend modes or color transforms
Non-rectangular clip regions
Layers requiring rotation that hardware cannot handle
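
The fallback triggers above can be pictured with a toy classifier. This is only a sketch under simplifying assumptions: LayerInfo, Composition, and classifyLayers are hypothetical names, the real decision is made by the vendor's HWC implementation, and the model ignores details such as the plane needed for the client target buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Composition { Device, Client };

// Hypothetical per-layer flags corresponding to the triggers listed above.
struct LayerInfo {
    bool complexBlend = false;
    bool nonRectangularClip = false;
    bool unsupportedRotation = false;
};

// Assigns DEVICE to layers the hardware can handle while planes remain,
// and CLIENT (GPU composition) to everything else.
std::vector<Composition> classifyLayers(const std::vector<LayerInfo>& layers,
                                        std::size_t hardwarePlanes) {
    std::vector<Composition> result;
    result.reserve(layers.size());
    std::size_t planesUsed = 0;
    for (const LayerInfo& layer : layers) {
        const bool unsupported = layer.complexBlend ||
                                 layer.nonRectangularClip ||
                                 layer.unsupportedRotation;
        if (!unsupported && planesUsed < hardwarePlanes) {
            result.push_back(Composition::Device);
            ++planesUsed;
        } else {
            result.push_back(Composition::Client);
        }
    }
    return result;
}
```

With three layers and a single plane, the first supported layer takes the plane and the rest fall back to CLIENT, whether because of an unsupported feature or because the planes ran out.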

13.9.6 RenderEngine Shader Pipeline¶

The Skia-based RenderEngine uses a custom shader pipeline for composition:

graph LR
    A["Layer Buffer<br/>(Texture)"] --> B["Vertex Shader<br/>(Position + UV)"]
    B --> C["Fragment Shader"]
    C --> D["Color Space<br/>Conversion"]
    D --> E["Tone Mapping<br/>(HDR→SDR)"]
    E --> F["Alpha Blend"]
    F --> G["Output Buffer"]

    style C fill:#FF9800,color:#fff
    style D fill:#2196F3,color:#fff

13.9.7 Triple Buffering and Buffer Management¶

The BufferQueue between the application and SurfaceFlinger typically maintains three buffers:

graph TD
    subgraph "Buffer States"
        A["Buffer A<br/>Being Displayed"]
        B["Buffer B<br/>Queued for Display"]
        C["Buffer C<br/>App Rendering"]
    end

    subgraph "Flow"
        D["App dequeues C"] --> E["App renders into C"]
        E --> F["App queues C"]
        F --> G["SF acquires B"]
        G --> H["SF displays B"]
        H --> I["SF releases A"]
        I --> D
    end

    style A fill:#4CAF50,color:#fff
    style B fill:#FF9800,color:#fff
    style C fill:#2196F3,color:#fff

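This triple-buffering scheme is designed so that:

The app usually has a free buffer to render into, so it rarely stalls in dequeueBuffer()
SurfaceFlinger usually has a queued buffer ready for display at the next VSYNC
An occasional slow frame can be absorbed by the queued buffer instead of causing a visible glitch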
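
The buffer states can be modelled as a small state machine. The sketch below is a toy model, not the real BufferQueue: ToyBufferQueue and SlotState are hypothetical names, there are no fences or threads, and a return value of -1 stands in for the producer blocking.

```cpp
#include <array>
#include <cassert>
#include <deque>

enum class SlotState { Free, Dequeued, Queued, Acquired };

class ToyBufferQueue {
public:
    // Producer: take a free slot to render into, or -1 if none is free.
    int dequeue() {
        for (int i = 0; i < kSlots; ++i) {
            if (mSlots[i] == SlotState::Free) {
                mSlots[i] = SlotState::Dequeued;
                return i;
            }
        }
        return -1;
    }
    // Producer: hand a rendered slot to the consumer.
    bool queue(int slot) {
        if (!valid(slot) || mSlots[slot] != SlotState::Dequeued) return false;
        mSlots[slot] = SlotState::Queued;
        mFifo.push_back(slot);
        return true;
    }
    // Consumer: take the oldest queued slot for display, or -1 if none.
    int acquire() {
        if (mFifo.empty()) return -1;
        const int slot = mFifo.front();
        mFifo.pop_front();
        mSlots[slot] = SlotState::Acquired;
        return slot;
    }
    // Consumer: return a displayed slot to the free pool.
    bool release(int slot) {
        if (!valid(slot) || mSlots[slot] != SlotState::Acquired) return false;
        mSlots[slot] = SlotState::Free;
        return true;
    }

private:
    static constexpr int kSlots = 3;
    static bool valid(int slot) { return slot >= 0 && slot < kSlots; }
    std::array<SlotState, kSlots> mSlots{};  // value-initialized to Free
    std::deque<int> mFifo;                   // queued slots in FIFO order
};
```

With one slot acquired and one queued, a third dequeue() still succeeds, which is the situation shown in the diagram; only a fourth request would have to wait for a release().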

13.10 GPU Driver Interface¶

13.10.1 HAL Interfaces¶

The GPU driver interface is defined in hardware/interfaces/graphics/. The key HAL modules are:

graph TD
    subgraph "Graphics HAL Stack"
        A["IComposer<br/>(HWC HAL)"]
        B["IAllocator<br/>(Gralloc HAL)"]
        C["IMapper<br/>(Buffer Mapping)"]
        D["Vulkan HAL<br/>(hwvulkan)"]
        E["EGL/GLES<br/>(Vendor Driver)"]
    end

    F["SurfaceFlinger"] --> A
    F --> B
    F --> C

    G["HWUI / Apps"] --> D
    G --> E

    A --> H["Display Hardware"]
    B --> I["Memory Allocator"]
    D --> J["GPU Hardware"]
    E --> J

    style A fill:#F44336,color:#fff
    style B fill:#FF9800,color:#fff
    style D fill:#2196F3,color:#fff

13.10.2 The Gralloc Allocator¶

Buffer allocation is handled by the Gralloc HAL, defined via AIDL in hardware/interfaces/graphics/allocator/aidl/:

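// hardware/interfaces/graphics/allocator/aidl/android/hardware/graphics/allocator/IAllocator.aidl (abridged)
interface IAllocator {
    // Original entry point; takes an encoded descriptor and is deprecated in V2
    AllocationResult allocate(in byte[] descriptor, in int count);
    // V2 entry point taking the structured descriptor
    AllocationResult allocate2(in BufferDescriptorInfo descriptor,
                               in int count);
    boolean isSupported(in BufferDescriptorInfo descriptor);
}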

13.10.3 EGL Driver Loading¶

The EGL driver is loaded by Loader::open() in frameworks/native/opengl/libs/EGL/Loader.cpp. The loader searches for:

libEGL_<name>.so -- EGL implementation
libGLESv1_CM_<name>.so -- OpenGL ES 1.x implementation
libGLESv2_<name>.so -- OpenGL ES 2.0+ implementation

Where <name> comes from properties like ro.hardware.egl or the system board platform name.

13.10.4 Vulkan Driver Loading¶

As detailed in Section 9.3.2, the Vulkan driver is loaded via the hwvulkan HAL module. The driver library is named vulkan.<name>.so where <name> comes from:

// frameworks/native/vulkan/libvulkan/driver.cpp, line 145
const std::array<const char*, 2> HAL_SUBNAME_KEY_PROPERTIES = {{
    "ro.hardware.vulkan",
    "ro.board.platform",
}};
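
The property list reads as a priority order: the first property that yields a name decides the library file name. The sketch below illustrates only that lookup. PropertyMap and resolveVulkanLibrary are hypothetical stand-ins (the real loader reads Android system properties and then tries to load the module), and any further fallback behaviour of the loader is not modelled.

```cpp
#include <array>
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the system property store.
using PropertyMap = std::map<std::string, std::string>;

// Returns "vulkan.<name>.so" for the first non-empty property in priority
// order, or an empty string if neither property is set.
std::string resolveVulkanLibrary(const PropertyMap& props) {
    static const std::array<const char*, 2> kKeys = {{
        "ro.hardware.vulkan",
        "ro.board.platform",
    }};
    for (const char* key : kKeys) {
        const auto it = props.find(key);
        if (it != props.end() && !it->second.empty()) {
            return "vulkan." + it->second + ".so";
        }
    }
    return std::string();
}
```

For example, a device that sets only ro.board.platform to a made-up value such as exampleboard would resolve to vulkan.exampleboard.so, and setting ro.hardware.vulkan as well would take precedence.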

13.10.5 Updated/Game Driver Mechanism¶

Android supports updatable GPU drivers through the GraphicsEnv system:

graph TD
    A["App Launch"] --> B["GraphicsEnv"]
    B --> C{"Updated Driver<br/>Available?"}
    C -->|Yes| D["Load from<br/>updatable namespace"]
    C -->|No| E{"APEX Driver?"}
    E -->|Yes| F["Load from<br/>APEX namespace"]
    E -->|No| G["Load built-in<br/>vendor driver"]

    style D fill:#4CAF50,color:#fff
    style F fill:#FF9800,color:#fff
    style G fill:#2196F3,color:#fff

For Vulkan (driver.cpp, line 232):

int LoadUpdatedDriver(const hwvulkan_module_t** module) {
    auto ns = android::GraphicsEnv::getInstance().getDriverNamespace();
    if (!ns) return -ENOENT;
    android::GraphicsEnv::getInstance().setDriverToLoad(
        android::GpuStatsInfo::Driver::VULKAN_UPDATED);
    int result = LoadDriver(ns, "updatable gfx driver", module);
    if (result != 0) {
        LOG_ALWAYS_FATAL("couldn't find an updated Vulkan implementation");
    }
    return result;
}

13.10.6 The Hardware Composer HAL¶

The HWC HAL is the interface between SurfaceFlinger and the display hardware. It has evolved through several versions:

graph TD
    A["HWC 1.x<br/>(Legacy C API)"] --> B["HWC 2.x<br/>(HIDL)"]
    B --> C["HWC 3.x<br/>(AIDL)"]

    style A fill:#F44336,color:#fff
    style B fill:#FF9800,color:#fff
    style C fill:#4CAF50,color:#fff

The current AIDL-based HWC 3 interface is defined in hardware/interfaces/graphics/composer/aidl/. Key operations:

| Operation | Description |
| --- | --- |
| `createDisplay` | Register a new display |
| `setLayerBuffer` | Assign a buffer to a layer |
| `setLayerBlendMode` | Set alpha blending mode |
| `setLayerDataspace` | Set layer color space |
| `setLayerTransform` | Set rotation/flip transform |
| `validate` | Classify layers for composition |
| `present` | Submit the final frame to display |
| `getReleaseFences` | Get fences for released buffers |

13.10.7 Gralloc Buffer Allocation¶

All graphics buffers in Android are allocated through the Gralloc HAL. The allocation flow:

sequenceDiagram
    participant App as Application
    participant BQ as BufferQueue
    participant GA as GraphicBufferAllocator
    participant HAL as Gralloc HAL
    participant DMA as DMA-BUF / ION

    App->>BQ: dequeueBuffer()
    Note over BQ: No free buffers
    BQ->>GA: allocate(w, h, format, usage)
    GA->>HAL: IAllocator.allocate()
    HAL->>DMA: Allocate DMA buffer
    DMA-->>HAL: Buffer handle + fd
    HAL-->>GA: AllocationResult
    GA-->>BQ: GraphicBuffer
    BQ-->>App: Buffer ready

The BufferUsage flags determine where the buffer can be used:

| Flag | Meaning |
| --- | --- |
| `GPU_TEXTURE` | Can be sampled as a texture |
| `GPU_RENDER_TARGET` | Can be rendered to |
| `COMPOSER_OVERLAY` | Can be used as HWC overlay |
| `CPU_READ_OFTEN` | Efficient CPU read access |
| `VIDEO_ENCODER` | Can be consumed by video encoder |
| `CAMERA` | Can be produced by camera HAL |
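
Usage flags are combined as a bitmask, and a consumer checks that every bit it needs is present. The sketch below shows that pattern only. The ToyUsage bit positions are made up for illustration and hasUsage is a hypothetical helper; the real values are defined in BufferUsage.aidl.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative bit assignments only; the real values live in BufferUsage.aidl.
enum ToyUsage : std::uint64_t {
    kGpuTexture      = 1ull << 0,
    kGpuRenderTarget = 1ull << 1,
    kComposerOverlay = 1ull << 2,
    kVideoEncoder    = 1ull << 3,
};

// True if every bit in `required` is also set in `granted`.
bool hasUsage(std::uint64_t granted, std::uint64_t required) {
    return (granted & required) == required;
}
```

A buffer requested with kGpuRenderTarget | kComposerOverlay could, in this model, be rendered by the GPU and scanned out as an overlay, but hasUsage would report that it cannot be sampled as a texture.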

13.10.8 Common AIDL Types¶

The common graphics types are defined in hardware/interfaces/graphics/common/aidl/. Key types include:

| Type | Purpose |
| --- | --- |
| `PixelFormat` | Buffer pixel format (RGBA8888, RGBA_FP16, etc.) |
| `Dataspace` | Color space + transfer function + range |
| `BufferUsage` | Usage flags (GPU_TEXTURE, GPU_RENDER_TARGET, etc.) |
| `BlendMode` | Hardware composition blend modes |
| `Transform` | Display transforms (rotation, flip) |
| `Hdr` | HDR capability types (HLG, HDR10, Dolby Vision) |
| `ColorTransform` | Color correction matrix types |

13.11 Try It: Trace a Frame¶

13.11.1 Using Perfetto to Trace Frame Rendering¶

Perfetto (the system-wide tracing tool) is the primary way to observe the graphics pipeline in action. The ATRACE calls scattered throughout the code (ATRACE_CALL(), ATRACE_NAME(), ATRACE_FORMAT()) produce trace events that Perfetto captures.

Step 1: Capture a trace with GPU and graphics categories.

# On a rooted device or emulator:
adb shell perfetto \
  -c - --txt \
  -o /data/misc/perfetto-traces/trace.perfetto-trace \
<<EOF
buffers: {
    size_kb: 63488
    fill_policy: RING_BUFFER
}
data_sources: {
    config {
        name: "linux.ftrace"
        ftrace_config {
            ftrace_events: "ftrace/print"
            atrace_categories: "gfx"
            atrace_categories: "view"
            atrace_categories: "hwui"
            atrace_categories: "input"
            atrace_apps: "com.example.myapp"
        }
    }
}
duration_ms: 10000
EOF

Step 2: Interact with the app during the 10-second capture window.

Step 3: Pull and analyze the trace.

adb pull /data/misc/perfetto-traces/trace.perfetto-trace .
# Open at https://ui.perfetto.dev

13.11.2 What to Look For in the Trace¶

In the Perfetto UI, you will see these key tracks:

graph LR
    subgraph "Perfetto Trace Tracks"
        A["UI Thread<br/>- Choreographer#doFrame<br/>- performTraversals<br/>- draw"]
        B["RenderThread<br/>- DrawFrames<br/>- syncFrameState<br/>- flush commands"]
        C["GPU Completion<br/>- Actual GPU work time"]
        D["SurfaceFlinger<br/>- onMessageInvalidate<br/>- composite"]
        E["HWC<br/>- present"]
    end

    A --> B
    B --> C
    C --> D
    D --> E

13.11.3 Key Trace Events¶

| Trace Event | Source File | Meaning |
| --- | --- | --- |
| `Choreographer#doFrame` | `Choreographer.java` | VSYNC-triggered frame start |
| `Record View#draw()` | `ViewRootImpl.java` | Canvas recording phase |
| `DrawFrames <vsyncId>` | `DrawFrameTask.cpp:91` | RenderThread frame start |
| `syncFrameState` | `DrawFrameTask.cpp:170` | Property/DL sync |
| `flush commands` | `SkiaOpenGLPipeline.cpp:181` | GPU command submission |
| `eglSwapBuffers` | `eglApi.cpp:260` | Buffer presentation |
| `dequeueBuffer` | `BufferQueueProducer.cpp` | Buffer acquisition |
| `queueBuffer` | `BufferQueueProducer.cpp` | Buffer completion |

13.11.4 Measuring Frame Timing with `dumpsys gfxinfo`¶

# Enable frame stats collection
adb shell setprop debug.hwui.profile true

# Run your app, then:
adb shell dumpsys gfxinfo com.example.myapp

# Output includes per-frame timing:
# Draw    Prepare Process  Execute
# 1.20    0.82    5.43     3.21
# 0.98    0.73    4.87     2.95

The four columns correspond to:

Draw: UI thread recording time
Prepare: Sync time (texture uploads, etc.)
Process: RenderThread GPU command recording
Execute: GPU execution and swap time
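
Summing the four columns of a row gives an approximate per-frame cost that can be compared against the frame budget. The sketch below parses one such row; FrameRow, parseFrameRow, and totalMs are hypothetical names, and the whitespace-separated layout is assumed from the sample output above.

```cpp
#include <cassert>
#include <cmath>
#include <sstream>
#include <string>

// One row of the per-frame profile output, in milliseconds.
struct FrameRow {
    double draw;
    double prepare;
    double process;
    double execute;
};

// Parses a whitespace-separated row; returns false if fewer than four
// numbers could be read.
bool parseFrameRow(const std::string& line, FrameRow* out) {
    std::istringstream in(line);
    return static_cast<bool>(in >> out->draw >> out->prepare
                                >> out->process >> out->execute);
}

// Total of all four stages for the frame.
double totalMs(const FrameRow& row) {
    return row.draw + row.prepare + row.process + row.execute;
}
```

For the first sample row above, the stages sum to 10.66 ms, which fits inside a 16.67 ms budget.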

13.11.5 GPU Memory Debugging¶

# Dump HWUI memory usage
adb shell dumpsys gfxinfo com.example.myapp meminfo

# Output shows:
# Pipeline=Skia (Vulkan)
# Memory policy:
#   Max surface area: 2764800
#   Max resource usage: 22.12MB (x8)
#   Background retention: 50%
# CPU Caches:
#   Bitmaps: 2.45 MB
#   Glyph Cache: 1.23 MB
# GPU Caches:
#   Textures: 15.67 MB
#   Buffers: 3.21 MB

13.11.6 Vulkan Validation Layers¶

Enable Vulkan validation for debugging:

# Enable validation layers
adb shell setprop debug.vulkan.layers VK_LAYER_KHRONOS_validation

# Or per-app via developer settings:
# Settings > Developer options > Graphics driver preferences
# Select the target app and enable "Vulkan validation"

13.11.7 GPU Rendering Profile Bars¶

The on-device GPU rendering profiler visualizes frame timing as color-coded bars:

# Enable via developer options or:
adb shell setprop debug.hwui.profile visual_bars

The bars show:

Blue: Draw (UI thread)
Purple: Prepare
Red: Process (RenderThread)
Orange: Execute (GPU + swap)
Green line: 16ms budget threshold

13.11.8 ANGLE Debugging¶

To force a specific app to use ANGLE:

# Enable ANGLE for a specific package
adb shell settings put global angle_gl_driver_selection_pkgs \
    com.example.myapp
adb shell settings put global angle_gl_driver_selection_values \
    angle

13.11.9 Inspecting the Render Pipeline¶

# Check which pipeline the debug property requests
adb shell getprop debug.hwui.renderer
# Returns: "skiavk" or "skiagl" (may be empty if the property was never set)

# Force a specific pipeline (requires restarting the framework, done below)
adb shell setprop debug.hwui.renderer skiavk
adb shell stop
adb shell start

13.11.10 Building and Testing Graphics Changes¶

When modifying HWUI:

# Build HWUI
cd frameworks/base/libs/hwui
mm -j$(nproc)

# Run HWUI unit tests
adb sync
adb shell /data/nativetest64/hwui_unit_tests/hwui_unit_tests

# Run rendering tests
adb shell am instrument -w \
    android.uirendering.cts/androidx.test.runner.AndroidJUnitRunner

When modifying the Vulkan loader:

# Build the Vulkan loader
cd frameworks/native/vulkan
mm -j$(nproc)

# Run loader tests
adb sync
adb shell /data/nativetest64/libvulkan_test/libvulkan_test

13.11.11 SKP Capture for Debugging¶

HWUI supports capturing Skia Picture (SKP) files that record all drawing commands for offline analysis:

# Enable SKP capture
adb shell setprop debug.hwui.capture_skp_enabled true

# Capture frames from a specific app
adb shell setprop debug.hwui.capture_skp_filename \
    /data/local/tmp/frame.skp

# Trigger capture (the next frame will be captured)
adb shell kill -10 $(pidof com.example.myapp)

# Pull the captured file
adb pull /data/local/tmp/frame.skp

# Analyze with Skia's viewer tool or https://debugger.skia.org

SKP files contain:

Every SkCanvas draw call with full parameters
All referenced SkImage data (bitmaps)
SkPaint state for each operation
Transform and clip state changes

This is invaluable for debugging rendering issues because you can replay the exact sequence of draw calls in Skia's debugger tool.

13.11.12 Overdraw Debugging¶

HWUI can visualize overdraw (regions drawn multiple times per frame):

# Enable overdraw visualization
adb shell setprop debug.hwui.overdraw show

# Color coding:
# No color    = drawn once (ideal)
# Blue        = drawn twice
# Green       = drawn three times
# Pink        = drawn four times
# Red         = drawn five or more times (problematic)

graph TD
    A["Drawn once<br/>(no tint)"] -->|"Normal"| B["Optimal Performance"]
    C["Drawn twice<br/>(Blue)"] -->|"Common"| D["Usually Acceptable"]
    E["Drawn three times<br/>(Green)"] -->|"Watch"| F["Consider Optimization"]
    I["Drawn four times<br/>(Pink)"] -->|"Warning"| J["Should Optimize"]
    G["Drawn five or more times<br/>(Red)"] -->|"Issue"| H["Needs Optimization"]

    style A fill:#FFFFFF,color:#000
    style C fill:#6495ED,color:#fff
    style E fill:#4CAF50,color:#fff
    style I fill:#FFC0CB,color:#000
    style G fill:#F44336,color:#fff

13.11.13 GPU Completion Timeline¶

For detailed GPU timing analysis:

# Enable GPU completion fence timestamps
adb shell setprop debug.hwui.profile true

# The timing data includes:
# - handlePlayback: Time to issue GPU commands
# - sync: Time for frame state sync
# - draw: Time for GPU command recording
# - dequeueBuffer: Time to acquire a buffer
# - queueBuffer: Time to submit a buffer

13.11.14 Inspecting BufferQueue State¶

# List the names of all layers known to SurfaceFlinger
adb shell dumpsys SurfaceFlinger --list

# Dump detailed layer info
adb shell dumpsys SurfaceFlinger

# This shows:
# - Layer name and bounds
# - Buffer size and format
# - Composition type (DEVICE/CLIENT)
# - Visible region
# - Damage region
# - Buffer queue state (slots, pending buffers)

13.11.15 Hardware Composer Debugging¶

# Dump HWC state
adb shell dumpsys SurfaceFlinger --hwc

# Shows for each display:
# - Active config (resolution, refresh rate)
# - Layer composition decisions
# - Hardware overlay usage
# - GPU fallback reasons

13.11.16 Tracing GPU Memory¶

# Trace GPU memory allocations
adb shell setprop debug.hwui.trace_gpu_resources true

# Or use Perfetto with GPU memory counters:
adb shell perfetto \
  -c - --txt \
  -o /data/misc/perfetto-traces/gpu_mem.perfetto-trace \
<<EOF
buffers: {
    size_kb: 32768
}
data_sources: {
    config {
        name: "android.gpu.memory"
    }
}
duration_ms: 5000
EOF

13.11.17 Forcing Specific Render Behavior¶

# Force all rendering through GPU composition (no HWC overlays)
adb shell service call SurfaceFlinger 1008 i32 1

# Restore the default behavior (HWC may choose overlays again)
adb shell service call SurfaceFlinger 1008 i32 0

# Show surface update flashes
adb shell service call SurfaceFlinger 1002

# These are useful for diagnosing composition-related issues

13.11.18 Interactive GPU Debugging with RenderDoc¶

For advanced GPU debugging, RenderDoc can be used on Android:

# Install RenderDoc server on device
adb install renderdoc-server.apk

# Connect from desktop RenderDoc application
# Capture individual frames
# Inspect:
#   - All GPU draw calls
#   - Shader source code
#   - Texture/buffer contents
#   - Pipeline state at each draw
#   - GPU timing per draw call

13.11.19 Monitoring Frame Drops¶

# Watch for jank in real-time
adb shell dumpsys gfxinfo com.example.myapp framestats

# Output includes per-frame columns:
# FLAGS|INTENDED_VSYNC|VSYNC|OLDEST_INPUT_EVENT|
# NEWEST_INPUT_EVENT|HANDLE_INPUT_START|
# ANIMATION_START|PERFORM_TRAVERSALS_START|
# DRAW_START|SYNC_QUEUED|SYNC_START|
# ISSUE_DRAW_COMMANDS_START|SWAP_BUFFERS|
# FRAME_COMPLETED|DEADLINE|GPU_COMPLETED

Apart from FLAGS, each column is a nanosecond timestamp. Subtracting one timestamp from a later one (for example, FRAME_COMPLETED minus INTENDED_VSYNC for the total frame time) shows where time was spent in each frame phase.
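
As a sketch of how such timestamps turn into durations, the helpers below subtract selected columns and convert the result to milliseconds. FrameTimestamps holds only a subset of the columns, and the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// A subset of the framestats columns, all in nanoseconds.
struct FrameTimestamps {
    std::int64_t intendedVsync;
    std::int64_t syncStart;
    std::int64_t issueDrawCommandsStart;
    std::int64_t frameCompleted;
};

double nsToMs(std::int64_t ns) {
    return static_cast<double>(ns) / 1e6;
}

// Total frame time: FRAME_COMPLETED minus INTENDED_VSYNC.
double totalFrameMs(const FrameTimestamps& f) {
    return nsToMs(f.frameCompleted - f.intendedVsync);
}

// Sync duration: ISSUE_DRAW_COMMANDS_START minus SYNC_START.
double syncMs(const FrameTimestamps& f) {
    return nsToMs(f.issueDrawCommandsStart - f.syncStart);
}

// True if the frame took longer than the given budget.
bool missedBudget(const FrameTimestamps& f, double budgetMs) {
    return totalFrameMs(f) > budgetMs;
}
```

A frame whose FRAME_COMPLETED lies 20 ms after its INTENDED_VSYNC would be flagged by missedBudget(frame, 16.67).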

13.12 Deep Dive: Layer Rendering¶

13.12.1 Offscreen Layer Architecture¶

HWUI uses offscreen rendering layers for Views that need to be composited separately. This includes Views with non-1.0 alpha, image filters (blur, color matrix), or stretch effects. The SkiaGpuPipeline manages these layers in SkiaGpuPipeline.cpp.

graph TD
    A["RenderNode<br/>(LayerType::RenderLayer)"] --> B["SkSurface<br/>(GPU texture)"]
    B --> C["Render layer content<br/>into offscreen texture"]
    C --> D["Composite into parent<br/>with alpha/blend/filter"]

    E["RenderNode<br/>(promotedToLayer)"] --> F["Automatic Layer<br/>Promotion"]
    F --> B

    style A fill:#FF9800,color:#fff
    style E fill:#2196F3,color:#fff

13.12.2 Layer Creation and Sizing¶

Layers are created with dimensions rounded up to the nearest LAYER_SIZE boundary:

// frameworks/base/libs/hwui/pipeline/skia/SkiaGpuPipeline.cpp, line 72
bool SkiaGpuPipeline::createOrUpdateLayer(RenderNode* node,
        const DamageAccumulator& damageAccumulator,
        ErrorHandler* errorHandler) {
    const int surfaceWidth =
        ceilf(node->getWidth() / float(LAYER_SIZE)) * LAYER_SIZE;
    const int surfaceHeight =
        ceilf(node->getHeight() / float(LAYER_SIZE)) * LAYER_SIZE;

    SkSurface* layer = node->getLayerSurface();
    if (!layer || layer->width() != surfaceWidth ||
        layer->height() != surfaceHeight) {
        SkImageInfo info = SkImageInfo::Make(
            surfaceWidth, surfaceHeight,
            getSurfaceColorType(), kPremul_SkAlphaType,
            getSurfaceColorSpace());
        node->setLayerSurface(SkSurfaces::RenderTarget(
            mRenderThread.getGrContext(),
            skgpu::Budgeted::kYes, info, 0,
            this->getSurfaceOrigin(), &props));
        // ...
    }
}
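
The rounding arithmetic in isolation looks like the sketch below. The tile size of 64 is an assumption chosen for illustration (the real LAYER_SIZE constant is defined in HWUI), and roundUpToLayerSize is a hypothetical helper name.

```cpp
#include <cassert>
#include <cmath>

// Assumed tile size for illustration; not necessarily the value HWUI uses.
constexpr int kLayerSize = 64;

// Rounds a dimension up to the next multiple of kLayerSize, using the same
// ceil-divide-multiply form as the listing above.
int roundUpToLayerSize(int dimension) {
    return static_cast<int>(
               std::ceil(dimension / static_cast<float>(kLayerSize))) *
           kLayerSize;
}
```

Under that assumption a 100-pixel-wide node gets a 128-pixel-wide surface, and any width from 65 to 128 maps to the same surface size, so small size changes within that range would not require reallocating the layer.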

13.12.3 Layer Rendering Sequence¶

The layer rendering pipeline processes all dirty layers before drawing the main frame:

// frameworks/base/libs/hwui/pipeline/skia/SkiaGpuPipeline.cpp, line 36
void SkiaGpuPipeline::renderLayersImpl(
        const LayerUpdateQueue& layers, bool opaque) {
    sk_sp<GrDirectContext> cachedContext;
    for (size_t i = 0; i < layers.entries().size(); i++) {
        RenderNode* layerNode = layers.entries()[i].renderNode.get();
        if (CC_UNLIKELY(layerNode->getLayerSurface() == nullptr)) {
            continue;
        }
        bool rendered = renderLayerImpl(
            layerNode, layers.entries()[i].damage);
        // Batch GPU context flushes
        GrDirectContext* currentContext = GrAsDirectContext(
            layerNode->getLayerSurface()
                ->getCanvas()->recordingContext());
        if (cachedContext.get() != currentContext) {
            if (cachedContext.get()) {
                ATRACE_NAME("flush layers (context changed)");
                cachedContext->flushAndSubmit();
            }
            cachedContext.reset(SkSafeRef(currentContext));
        }
    }
    if (cachedContext.get()) {
        ATRACE_NAME("flush layers");
        cachedContext->flushAndSubmit();
    }
}
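
The batching logic can be checked in isolation by counting flushes. In the sketch below each int is a hypothetical stand-in for the GPU context of one layer (assumed non-negative), and countFlushes is an illustrative helper that follows the same control flow as the loop above.

```cpp
#include <cassert>
#include <vector>

// Counts the flushes issued for a sequence of layers, one context ID per
// layer: a flush happens whenever the context changes, plus one final flush.
int countFlushes(const std::vector<int>& layerContexts) {
    const int kNoContext = -1;
    int cached = kNoContext;
    int flushes = 0;
    for (int current : layerContexts) {
        if (cached != current) {
            if (cached != kNoContext) {
                ++flushes;  // context changed: flush the previous one
            }
            cached = current;
        }
    }
    if (cached != kNoContext) {
        ++flushes;  // final flush for the last context seen
    }
    return flushes;
}
```

In the common case where every layer shares one context, the whole batch costs a single flush regardless of how many layers were rendered.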

13.12.4 Image Pinning¶

For mutable images (the mutableImages parameter below), SkiaGpuPipeline pins each image as a GPU texture so that its contents are available during rendering:

// frameworks/base/libs/hwui/pipeline/skia/SkiaGpuPipeline.cpp, line 115
bool SkiaGpuPipeline::pinImages(
        std::vector<SkImage*>& mutableImages) {
    for (SkImage* image : mutableImages) {
        if (skgpu::ganesh::PinAsTexture(
                mRenderThread.getGrContext(), image)) {
            mPinnedImages.emplace_back(sk_ref_sp(image));
        } else {
            return false;
        }
    }
    return true;
}

13.12.5 Hardware Buffer Rendering¶

Both pipelines support rendering to AHardwareBuffer for off-screen rendering targets (used by SurfaceTexture, ImageReader, etc.):

// frameworks/base/libs/hwui/pipeline/skia/SkiaGpuPipeline.cpp, line 153
sk_sp<SkSurface> SkiaGpuPipeline::getBufferSkSurface(
        const HardwareBufferRenderParams& bufferParams) {
    auto bufferColorSpace = bufferParams.getColorSpace();
    if (mBufferSurface == nullptr || mBufferColorSpace == nullptr ||
        !SkColorSpace::Equals(mBufferColorSpace.get(),
                              bufferColorSpace.get())) {
        mBufferSurface = SkSurfaces::WrapAndroidHardwareBuffer(
            mRenderThread.getGrContext(), mHardwareBuffer,
            kTopLeft_GrSurfaceOrigin, bufferColorSpace,
            nullptr, true);
        mBufferColorSpace = bufferColorSpace;
    }
    return mBufferSurface;
}

13.13 Deep Dive: RenderNode Drawing¶

13.13.1 RenderNodeDrawable¶

The RenderNodeDrawable class (pipeline/skia/RenderNodeDrawable.cpp) is the bridge between the display list tree and Skia's drawing system. It implements SkDrawable and handles:

Z-order reordering for elevation and shadows
Projection of child nodes onto ancestor surfaces
Outline clipping (for rounded corners)
Layer composition with blend modes and filters

// frameworks/base/libs/hwui/pipeline/skia/RenderNodeDrawable.cpp, line 41
RenderNodeDrawable::RenderNodeDrawable(
        RenderNode* node, SkCanvas* canvas,
        bool composeLayer, bool inReorderingSection)
    : mRenderNode(node)
    , mRecordedTransform(canvas->getTotalMatrix())
    , mComposeLayer(composeLayer)
    , mInReorderingSection(inReorderingSection) {}

13.13.2 Backwards Projection¶

Android's View system supports "projection" -- a child View can project its rendering onto an ancestor's surface. This is used for ripple effects that extend beyond the View's bounds:

// RenderNodeDrawable.cpp, line 54
void RenderNodeDrawable::drawBackwardsProjectedNodes(
        SkCanvas* canvas, const SkiaDisplayList& displayList,
        int nestLevel) const {
    for (auto& child : displayList.mChildNodes) {
        if (!child.getRenderNode()->isRenderable()) continue;
        const RenderProperties& childProperties =
            child.getNodeProperties();
        if (childProperties.getProjectBackwards() &&
            nestLevel > 0) {
            SkAutoCanvasRestore acr2(canvas, true);
            canvas->concat(child.getRecordedMatrix());
            child.drawContent(canvas);
        }
        // Recurse into sub-nodes...
    }
}

13.13.3 Outline Clipping¶

RenderNode outline clipping supports rectangles, rounded rectangles, and arbitrary paths:

// RenderNodeDrawable.cpp, line 89
static void clipOutline(const Outline& outline,
        SkCanvas* canvas, const SkRect* pendingClip) {
    Rect possibleRect;
    float radius;
    if (!outline.getAsRoundRect(&possibleRect, &radius)) {
        if (pendingClip) canvas->clipRect(*pendingClip);
        const SkPath* path = outline.getPath();
        if (path) {
            canvas->clipPath(*path, SkClipOp::kIntersect, true);
        }
        return;
    }
    SkRect rect = possibleRect.toSkRect();
    if (radius != 0.0f) {
        if (pendingClip && !pendingClip->contains(rect)) {
            canvas->clipRect(*pendingClip);
        }
        canvas->clipRRect(
            SkRRect::MakeRectXY(rect, radius, radius),
            SkClipOp::kIntersect, true);
    } else {
        if (pendingClip) (void)rect.intersect(*pendingClip);
        canvas->clipRect(rect);
    }
}

13.13.4 Z-Order and Reordering¶

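Nodes with non-zero Z values (elevation) are drawn in a special reordering section. The onDraw method draws a node immediately if it is outside a reordering section or its Z is zero; a node with non-zero Z inside a reordering section is skipped here and drawn later in Z order: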

// RenderNodeDrawable.cpp, line 125
void RenderNodeDrawable::onDraw(SkCanvas* canvas) {
    if ((!mInReorderingSection) ||
        MathUtils::isZero(mRenderNode->properties().getZ())) {
        this->forceDraw(canvas);
    }
}

Nodes with positive Z get shadows rendered first, then their content. Nodes with negative Z are drawn before their parent's content. This creates Android's Material Design elevation system.
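
Reading the condition in onDraw as a truth table may help. The sketch below restates it as a standalone predicate; drawsInline is a hypothetical name and the epsilon is an assumption standing in for MathUtils::isZero.

```cpp
#include <cassert>
#include <cmath>

// Mirrors the onDraw condition: a node draws immediately unless it is inside
// a reordering section and has a non-zero Z.
bool drawsInline(bool inReorderingSection, float z) {
    const float kEpsilon = 1e-5f;  // assumed tolerance for "zero"
    const bool zIsZero = std::fabs(z) <= kEpsilon;
    return !inReorderingSection || zIsZero;
}
```

Only the combination of a reordering section and a non-zero Z (positive or negative) returns false, which corresponds to the nodes whose drawing is deferred.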

13.14 Deep Dive: VulkanSurface¶

13.14.1 Surface Creation¶

VulkanSurface.cpp manages the integration between Vulkan and Android's native window system. When creating a surface, it connects to the native window and configures buffer management:

// frameworks/base/libs/hwui/renderthread/VulkanSurface.cpp, line 80
// Simplified: the error check after each call is omitted here
static bool ConnectAndSetWindowDefaults(ANativeWindow* window) {
    int err = native_window_api_connect(window,
        NATIVE_WINDOW_API_EGL);
    err = window->setSwapInterval(window, 1);
    err = native_window_set_shared_buffer_mode(window, false);
    err = native_window_set_auto_refresh(window, false);
    err = native_window_set_scaling_mode(window,
        NATIVE_WINDOW_SCALING_MODE_FREEZE);
    err = native_window_set_buffers_dimensions(window, 0, 0);
    // Enable auto prerotation for 90/270 degree rotation
    err = native_window_set_auto_prerotation(window, true);
    return true;
}

13.14.2 Pre-Transform Handling¶

Display rotation requires special handling in Vulkan. The VulkanSurface computes a pre-transform matrix that accounts for the display's current orientation:

// VulkanSurface.cpp, line 49
static SkMatrix GetPreTransformMatrix(
        SkISize windowSize, int transform) {
    const int width = windowSize.width();
    const int height = windowSize.height();
    switch (transform) {
        case 0:
            return SkMatrix::I();
        case ANATIVEWINDOW_TRANSFORM_ROTATE_90:
            return SkMatrix::MakeAll(
                0, -1, height, 1, 0, 0, 0, 0, 1);
        case ANATIVEWINDOW_TRANSFORM_ROTATE_180:
            return SkMatrix::MakeAll(
                -1, 0, width, 0, -1, height, 0, 0, 1);
        case ANATIVEWINDOW_TRANSFORM_ROTATE_270:
            return SkMatrix::MakeAll(
                0, 1, 0, -1, 0, width, 0, 0, 1);
        default:
            LOG_ALWAYS_FATAL("Unsupported Window Transform (%d)", transform);
    }
    return SkMatrix::I();
}
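
To make the ROTATE_90 case concrete, the sketch below applies the same nine coefficients to a point. Mat3 and Point are hypothetical stand-ins for SkMatrix and SkPoint, and the coefficient order is assumed to follow SkMatrix::MakeAll (scaleX, skewX, transX, skewY, scaleY, transY, then the perspective row).

```cpp
#include <cassert>

struct Point {
    float x;
    float y;
};

// Row-major 3x3 affine matrix; the perspective row is carried but unused.
struct Mat3 {
    float m[9];

    // Applies the affine part of the matrix to a point.
    Point map(Point p) const {
        return {m[0] * p.x + m[1] * p.y + m[2],
                m[3] * p.x + m[4] * p.y + m[5]};
    }
};

// The ROTATE_90 coefficients from the listing above for a given window height.
Mat3 rotate90(float height) {
    return Mat3{{0, -1, height, 1, 0, 0, 0, 0, 1}};
}
```

With a height of 100, the origin maps to (100, 0) and the point (10, 20) maps to (80, 10): the x coordinate becomes the new y, and the old y is flipped and shifted by the height.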

13.14.3 Pixel Snap Matrix¶

VulkanSurface also builds a "pixel snap" matrix that nudges geometry by a small offset (about 1/16 of a pixel) so that edges lying exactly on pixel boundaries rasterize consistently:

// VulkanSurface.cpp, line 68
SkM44 VulkanSurface::GetPixelSnapMatrix(
        SkISize windowSize, int transform) {
    static const SkScalar kOffset = 0.063f;  // ~1/16th pixel
    SkMatrix preRotation =
        GetPreTransformMatrix(windowSize, transform);
    SkMatrix invert;
    preRotation.invert(&invert);
    return SkM44::Translate(kOffset, kOffset)
        .postConcat(SkM44(preRotation))
        .preConcat(SkM44(invert));
}

This is a subtle but important optimization -- without the pixel snap, non-anti-aliased axis-aligned rectangles can produce hairline gaps due to floating-point precision issues.

13.15 Deep Dive: SkiaCanvas Implementation¶

13.15.1 The SkiaCanvas Class¶

SkiaCanvas (SkiaCanvas.h) is the concrete implementation of the Canvas abstract class. It wraps an SkCanvas and adds Android-specific features:

// frameworks/base/libs/hwui/SkiaCanvas.h, line 41
class SkiaCanvas : public Canvas {
public:
    explicit SkiaCanvas(const SkBitmap& bitmap);
    explicit SkiaCanvas(SkCanvas* canvas);

    // State operations
    virtual int getSaveCount() const override;
    virtual int save(SaveFlags::Flags flags) override;
    virtual void restore() override;

    // Drawing operations
    virtual void drawRect(float left, float top, float right,
        float bottom, const Paint& paint) override;
    virtual void drawRenderNode(
        uirenderer::RenderNode* renderNode) override;
    // ... 40+ more draw methods
};

13.15.2 The Paint Looper Pattern¶

SkiaCanvas implements a "looper" pattern for applying shadow/blur effects:

// SkiaCanvas.h, line 190
template <typename Proc>
void applyLooper(const Paint* paint, Proc proc,
                 void (*preFilter)(SkPaint&) = nullptr) {
    BlurDrawLooper* looper = paint ? paint->getLooper() : nullptr;
    Paint pnt = paint ? *paint : Paint();
    if (preFilter) preFilter(pnt);
    this->onFilterPaint(pnt);
    if (looper) {
        looper->apply(pnt,
            [&](SkPoint offset, const Paint& modifiedPaint) {
                mCanvas->save();
                mCanvas->translate(offset.fX, offset.fY);
                proc(modifiedPaint);
                mCanvas->restore();
            });
    } else {
        proc(pnt);
    }
}

This pattern draws the shadow layer first (with an offset and blur), then the foreground layer. It is used for text shadows and drop shadow effects.
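
The two-pass behaviour can be modelled without Skia. In the sketch below, `SketchPaint`, `ShadowLooper`, `DrawRecord`, and `drawTextAt` are hypothetical stand-ins for the real Paint, BlurDrawLooper, and canvas types; only the control flow (shadow pass at an offset, then the foreground pass, or a single pass when no looper is set) is meant to mirror applyLooper:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the real Paint / BlurDrawLooper types.
struct SketchPaint {
    std::string color = "black";
    float blurRadius = 0.0f;
};

struct ShadowLooper {
    float dx, dy, blurRadius;
    std::string shadowColor;

    // Calls `draw` once for the shadow pass (offset and blurred), then once
    // for the unmodified foreground pass.
    void apply(const SketchPaint& paint,
               const std::function<void(float, float, const SketchPaint&)>& draw) const {
        SketchPaint shadow = paint;
        shadow.color = shadowColor;
        shadow.blurRadius = blurRadius;
        draw(dx, dy, shadow);
        draw(0.0f, 0.0f, paint);
    }
};

// One recorded draw call: where it landed and with which color.
struct DrawRecord {
    float x, y;
    std::string color;
};

// Mirrors the shape of applyLooper: with a looper, each pass is drawn at its
// own offset; without one, the primitive is drawn exactly once.
std::vector<DrawRecord> drawTextAt(float x, float y, const SketchPaint& paint,
                                   const ShadowLooper* looper) {
    std::vector<DrawRecord> out;
    auto proc = [&](float ox, float oy, const SketchPaint& p) {
        out.push_back({x + ox, y + oy, p.color});
    };
    if (looper) {
        looper->apply(paint, proc);
    } else {
        proc(0.0f, 0.0f, paint);
    }
    return out;
}
```

Because the shadow pass is issued first, the foreground is painted on top of it, which is the ordering a drop shadow needs.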

13.15.3 Save Stack Management¶

SkiaCanvas maintains a save stack that tracks partial saves (saves that only preserve matrix or clip, not both):

// SkiaCanvas.h, line 210
struct SaveRec {
    int saveCount;
    SaveFlags::Flags saveFlags;
    size_t clipIndex;
};
std::unique_ptr<std::deque<SaveRec>> mSaveStack;
std::vector<Clip> mClipStack;

13.16 Deep Dive: RenderProxy and Thread Communication¶

13.16.1 The RenderProxy Pattern¶

RenderProxy (renderthread/RenderProxy.cpp) is the UI thread's handle to the RenderThread. It provides a type-safe interface for posting work:

// frameworks/base/libs/hwui/renderthread/RenderProxy.cpp, line 48
RenderProxy::RenderProxy(bool translucent,
        RenderNode* rootRenderNode,
        IContextFactory* contextFactory)
    : mRenderThread(RenderThread::getInstance()),
      mContext(nullptr) {
    pid_t uiThreadId = pthread_gettid_np(pthread_self());
    pid_t renderThreadId = getRenderThreadTid();
    mContext = mRenderThread.queue().runSync(
        [=, this]() -> CanvasContext* {
            return CanvasContext::create(mRenderThread,
                translucent, rootRenderNode, contextFactory,
                uiThreadId, renderThreadId);
        });
    mDrawFrameTask.setContext(
        &mRenderThread, mContext, rootRenderNode);
}

13.16.2 Synchronous vs Asynchronous Operations¶

RenderProxy uses two communication patterns:

Synchronous (runSync): Used when the UI thread needs a result.

bool RenderProxy::loadSystemProperties() {
    return mRenderThread.queue().runSync([this]() -> bool {
        bool needsRedraw = Properties::load();
        if (mContext->profiler().consumeProperties()) {
            needsRedraw = true;
        }
        return needsRedraw;
    });
}

Asynchronous (post): Used for fire-and-forget operations.

void RenderProxy::setSwapBehavior(SwapBehavior swapBehavior) {
    mRenderThread.queue().post(
        [this, swapBehavior]() {
            mContext->setSwapBehavior(swapBehavior);
        });
}
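
The difference between the two patterns can be sketched with standard C++ primitives. `SketchQueue` below is a hypothetical single-consumer queue, not HWUI's actual WorkQueue; it assumes that a synchronous call is simply an asynchronous post whose result the caller waits for through a future:

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical single-consumer work queue; not HWUI's actual WorkQueue.
class SketchQueue {
public:
    SketchQueue() : mWorker([this] { loop(); }) {}

    ~SketchQueue() {
        post([this] { mQuit = true; });  // runs on the worker, after earlier tasks
        mWorker.join();
    }

    // Asynchronous: enqueue and return immediately.
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mLock);
            mTasks.push_back(std::move(task));
        }
        mCondition.notify_one();
    }

    // Synchronous: enqueue, then block the caller until the result is ready.
    template <typename F>
    auto runSync(F&& f) -> decltype(f()) {
        using Result = decltype(f());
        // Shared ownership keeps the task alive until the worker is done with it.
        auto task = std::make_shared<std::packaged_task<Result()>>(std::forward<F>(f));
        std::future<Result> result = task->get_future();
        post([task] { (*task)(); });
        return result.get();
    }

private:
    void loop() {
        while (!mQuit) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mLock);
                mCondition.wait(lock, [this] { return !mTasks.empty(); });
                task = std::move(mTasks.front());
                mTasks.pop_front();
            }
            task();
        }
    }

    std::mutex mLock;
    std::condition_variable mCondition;
    std::deque<std::function<void()>> mTasks;
    bool mQuit = false;   // only touched on the worker thread
    std::thread mWorker;  // declared last so the other members exist first
};
```

Because the queue is first-in, first-out, a `runSync` call also acts as a barrier: every task posted earlier has finished by the time it returns.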

13.16.3 The DrawFrameTask Handoff¶

The most critical handoff is DrawFrameTask::drawFrame(), which uses a mutex and a condition variable to block the UI thread until the RenderThread has finished syncing:

sequenceDiagram
    participant UI as UI Thread
    participant Q as RenderThread Queue
    participant RT as RenderThread

    UI->>UI: DrawFrameTask.drawFrame()
    UI->>UI: mSyncResult = OK
    UI->>UI: mSyncQueued = now()
    UI->>Q: post(run)
    UI->>UI: mSignal.wait(mLock) [BLOCKED]

    RT->>RT: DrawFrameTask.run()
    RT->>RT: syncFrameState(info)
    Note over RT: Copy staging → render

    alt canUnblockUiThread
        RT-->>UI: mSignal.signal() [UNBLOCK]
        Note over UI: UI thread resumes
    end

    RT->>RT: context->draw()
    RT->>RT: GPU commands
    RT->>RT: swapBuffers()

    alt !canUnblockUiThread
        RT-->>UI: mSignal.signal() [UNBLOCK]
    end

The UI thread is typically unblocked as soon as the sync phase completes (before GPU work begins), allowing the next frame's measure/layout/record to overlap with the current frame's GPU rendering.
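
A reduced model of this handoff is sketched below. `FrameHandoff` and its members are hypothetical names, and the sketch spawns a thread per frame for brevity, whereas the real RenderThread is long-lived and receives the task through its queue. The point it illustrates is that the caller is released after the sync phase, not after the draw:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical, simplified model of the handoff; names are illustrative.
struct FrameHandoff {
    std::mutex lock;
    std::condition_variable signal;
    bool syncDone = false;

    std::vector<std::string> staging;    // written by the "UI thread"
    std::vector<std::string> rendering;  // owned by the "render thread"
    int framesDrawn = 0;

    // Render-thread side: copy staging state, unblock the UI thread,
    // then carry on with the (simulated) GPU work.
    void run() {
        {
            std::lock_guard<std::mutex> guard(lock);
            rendering = staging;  // sync phase
            syncDone = true;
        }
        signal.notify_one();  // the UI thread may resume from here on
        framesDrawn += 1;     // draw phase touches only render-side state
    }

    // UI-thread side: start the render work and block until sync completes.
    // Returns the worker so the caller decides when to wait for the draw.
    std::thread drawFrame() {
        syncDone = false;
        std::thread worker([this] { run(); });
        std::unique_lock<std::mutex> guard(lock);
        signal.wait(guard, [this] { return syncDone; });
        return worker;
    }
};
```

Once `drawFrame()` returns, the caller can safely modify `staging` for the next frame while the worker is still in its draw phase, because the draw phase only reads the render-side copy.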

13.17 Deep Dive: Color Management¶

13.17.1 The Color Pipeline¶

Android's graphics stack supports wide color gamut and HDR rendering throughout the pipeline:

graph LR
    A["App Content<br/>(sRGB / P3 / BT2020)"] --> B["HWUI<br/>Color Mode"]
    B --> C["Skia<br/>SkColorSpace"]
    C --> D["EGL/Vulkan Surface<br/>Color Space Attrib"]
    D --> E["BufferQueue<br/>Dataspace"]
    E --> F["SurfaceFlinger<br/>Color Management"]
    F --> G["HWC<br/>Layer Dataspace"]
    G --> H["Display<br/>Panel Gamut"]

    style B fill:#4CAF50,color:#fff
    style F fill:#9C27B0,color:#fff

13.17.2 Color Modes in HWUI¶

HWUI supports multiple color modes, which EglManager::createSurface() maps to an EGL color space attribute and a surface format:

ColorMode	EGL Attribute	Surface Format	Use Case
`Default`	`EGL_GL_COLORSPACE_LINEAR_KHR`	RGBA8888	Standard sRGB
`WideColorGamut`	`EGL_GL_COLORSPACE_DISPLAY_P3_PASSTHROUGH_EXT`	RGBA8888	P3 content
`Hdr`	`EGL_GL_COLORSPACE_SCRGB_EXT`	RGBA_F16	HDR content
`Hdr10`	P3 passthrough + override	RGBA_1010102	HDR10 content
`A8`	None	R8	Alpha masks

13.17.3 Wide Color Gamut in Vulkan¶

The VulkanSurface also supports wide color gamut:

// VulkanSurface.cpp (in Create method)
// Color space is set on the Vulkan swapchain through
// VkSwapchainCreateInfoKHR::imageColorSpace
// The actual dataspace is set via
// ANativeWindow_setBuffersDataSpace()

13.17.4 HDR Override Workaround¶

The EglManager contains a notable workaround for HDR: since there is no standard EGL color space for extended-range P3, it overrides the dataspace after surface creation:

// EglManager.cpp, line 517
if (overrideWindowDataSpaceForHdr) {
    int32_t err = ANativeWindow_setBuffersDataSpace(
        window, P3_XRB);
    LOG_ALWAYS_FATAL_IF(err,
        "Failed to ANativeWindow_setBuffersDataSpace %d", err);
}

13.18 Deep Dive: Damage Tracking and Partial Updates¶

13.18.1 The Damage Region Concept¶

HWUI tracks which portions of the screen have changed (the "damage region") to minimize GPU work. Only the damaged region needs to be re-rendered.

13.18.2 Buffer Age¶

The EglManager implements buffer age tracking for partial updates:

// frameworks/base/libs/hwui/renderthread/EglManager.cpp, line 578
EGLint EglManager::queryBufferAge(EGLSurface surface) {
    switch (mSwapBehavior) {
        case SwapBehavior::Discard:
            return 0;  // Must redraw everything
        case SwapBehavior::Preserved:
            return 1;  // Previous frame preserved
        case SwapBehavior::BufferAge:
            EGLint bufferAge;
            eglQuerySurface(mEglDisplay, surface,
                EGL_BUFFER_AGE_EXT, &bufferAge);
            return bufferAge;  // Age of buffer contents
    }
    return 0;
}

Buffer age tells the renderer how old the buffer's contents are:

Age 0: Unknown/new buffer, must redraw everything
Age 1: Previous frame's content, only need to update damaged area
Age 2: Frame from 2 frames ago, need larger damage union
Age N: Frame from N frames ago
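
The "larger damage union" can be made concrete. The sketch below is a minimal model, not HWUI's implementation: `DirtyRect`, `unite`, and `damageToRedraw` are hypothetical names, and it assumes the renderer keeps a history of the damage rectangles of previously swapped frames, newest first:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical integer rectangle; empty when right <= left or bottom <= top.
struct DirtyRect {
    int left, top, right, bottom;
    bool empty() const { return right <= left || bottom <= top; }
};

// Smallest rectangle covering both inputs (empty inputs are ignored).
DirtyRect unite(const DirtyRect& a, const DirtyRect& b) {
    if (a.empty()) return b;
    if (b.empty()) return a;
    return {std::min(a.left, b.left), std::min(a.top, b.top),
            std::max(a.right, b.right), std::max(a.bottom, b.bottom)};
}

// `history` holds the damage of previously swapped frames, newest first.
// A buffer of age N is missing the damage of the last N-1 swapped frames,
// so those are merged into the current frame's damage. If the age is 0 or
// the history is too short, the whole surface is redrawn.
DirtyRect damageToRedraw(const DirtyRect& current, int bufferAge,
                         const std::deque<DirtyRect>& history,
                         const DirtyRect& fullSurface) {
    if (bufferAge <= 0) return fullSurface;
    const std::size_t needed = static_cast<std::size_t>(bufferAge - 1);
    if (needed > history.size()) return fullSurface;
    DirtyRect result = current;
    for (std::size_t i = 0; i < needed; ++i) {
        result = unite(result, history[i]);
    }
    return result;
}
```

With age 1 the result is just the current damage; with age 2 it also covers the previous frame's damage, and so on until the history runs out and a full redraw is the only safe choice.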

13.18.3 Damage and Swap¶

The damage region is communicated to the driver via EGL_KHR_partial_update:

// EglManager.cpp, line 604
void EglManager::damageFrame(const Frame& frame,
                              const SkRect& dirty) {
    if (EglExtensions.setDamage &&
        mSwapBehavior == SwapBehavior::BufferAge) {
        EGLint rects[4];
        frame.map(dirty, rects);
        eglSetDamageRegionKHR(mEglDisplay, frame.mSurface,
            rects, 1);
    }
}

The swap itself also carries damage information, through eglSwapBuffersWithDamageKHR:

// EglManager.cpp, line 621
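bool EglManager::swapBuffers(const Frame& frame,
                              const SkRect& screenDirty) {
    EGLint rects[4];
    frame.map(screenDirty, rects);
    // An empty dirty rect passes zero rectangles, which EGL treats
    // as "the entire surface is damaged".
    eglSwapBuffersWithDamageKHR(mEglDisplay, frame.mSurface,
        rects, screenDirty.isEmpty() ? 0 : 1);
    // ... error handling elided: the result depends on eglGetError()
}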
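
Both excerpts rely on `frame.map()` to convert a dirty rectangle into the four integers EGL expects. The EGL damage extensions specify rectangles as {x, y, width, height} relative to the bottom-left corner of the surface, so the conversion involves a Y flip. The sketch below illustrates that conversion with hypothetical names (`WindowRect`, `toEglDamageRect`); it is not the HWUI source:

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical float rectangle in top-left-origin window coordinates.
struct WindowRect {
    float left, top, right, bottom;
};

// Converts a dirty rectangle into the {x, y, width, height} layout that the
// EGL damage extensions expect, with the origin at the bottom-left corner.
// Edges are rounded outward so that partially covered pixels are included.
std::array<int, 4> toEglDamageRect(const WindowRect& dirty, int surfaceHeight) {
    assert(surfaceHeight >= 0);
    const int left = static_cast<int>(std::floor(dirty.left));
    const int top = static_cast<int>(std::floor(dirty.top));
    const int right = static_cast<int>(std::ceil(dirty.right));
    const int bottom = static_cast<int>(std::ceil(dirty.bottom));
    const int width = right - left;
    const int height = bottom - top;
    // Flip Y: the rectangle's bottom edge becomes its EGL y coordinate.
    const int y = surfaceHeight - (top + height);
    return {left, y, width, height};
}
```

For a 100-pixel-high surface, the window-space rectangle (10.2, 20.5)-(50.0, 80.1) rounds out to (10, 20)-(50, 81) and maps to the EGL rectangle {10, 19, 40, 61}.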

13.19 Deep Dive: Animation and Frame Callbacks¶

13.19.1 The Animation Framework Integration¶

HWUI integrates with Android's animation framework through the AnimatorManager class. Each RenderNode has an AnimatorManager that handles property animations that run on the RenderThread:

// RenderNode.h, line 157
void addAnimator(const sp<BaseRenderNodeAnimator>& animator);
void removeAnimator(const sp<BaseRenderNodeAnimator>& animator);
AnimatorManager& animators() { return mAnimatorManager; }

13.19.2 Frame Callbacks¶

The RenderThread supports frame callbacks for custom rendering (e.g., TextureView):

// RenderThread.cpp, line 368
void RenderThread::dispatchFrameCallbacks() {
    ATRACE_CALL();
    mFrameCallbackTaskPending = false;
    std::set<IFrameCallback*> callbacks;
    mFrameCallbacks.swap(callbacks);
    if (callbacks.size()) {
        requestVsync();  // Pre-emptively request next VSYNC
        for (auto it = callbacks.begin();
             it != callbacks.end(); it++) {
            (*it)->doFrame();
        }
    }
}

13.19.3 VSYNC-Deadline Scheduling¶

The RenderThread uses a sophisticated scheduling algorithm that accounts for the frame deadline:

// RenderThread.cpp, line 73
void RenderThread::frameCallback(
        int64_t vsyncId, int64_t frameDeadline,
        int64_t frameTimeNanos, int64_t frameInterval) {
    mVsyncRequested = false;
    if (timeLord().vsyncReceived(
            frameTimeNanos, frameTimeNanos,
            vsyncId, frameDeadline, frameInterval) &&
        !mFrameCallbackTaskPending) {
        mFrameCallbackTaskPending = true;
        // Convert the raw nanosecond values to steady_clock time
        // points (chrono type aliases elided)
        const auto frameTimeTimePoint =
            SteadyClock::time_point(nsecs(frameTimeNanos));
        const auto deadlineTimePoint =
            SteadyClock::time_point(nsecs(frameDeadline));
        // Schedule work at 25% of the way to the deadline
        const auto timeUntilDeadline =
            deadlineTimePoint - frameTimeTimePoint;
        const auto runAt =
            (frameTimeTimePoint + (timeUntilDeadline / 4));
        queue().postAt(
            toNsecs_t(runAt.time_since_epoch()).count(),
            [this]() { dispatchFrameCallbacks(); });
    }
}

Dispatching a quarter of the way between the VSYNC timestamp and the frame deadline is intended to start the RenderThread's frame work early enough to finish before the deadline, while still leaving the UI thread time to process input events that arrive after the VSYNC.
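
The arithmetic is small enough to check by hand. The helper below is a hypothetical standalone function (`callbackRunAtNanos` is not an HWUI name) that reproduces the same computation with std::chrono:

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Computes when to dispatch frame callbacks: a quarter of the way from the
// VSYNC timestamp to the frame deadline. Both inputs are nanoseconds on the
// same monotonic clock.
std::int64_t callbackRunAtNanos(std::int64_t frameTimeNanos,
                                std::int64_t frameDeadlineNanos) {
    assert(frameDeadlineNanos >= frameTimeNanos);
    using std::chrono::nanoseconds;
    const nanoseconds frameTime(frameTimeNanos);
    const nanoseconds deadline(frameDeadlineNanos);
    const nanoseconds timeUntilDeadline = deadline - frameTime;
    return (frameTime + timeUntilDeadline / 4).count();
}
```

For a VSYNC at t = 1,000,000,000 ns with a deadline 16 ms later, the callbacks are dispatched at 1,004,000,000 ns, that is, 4 ms after the VSYNC. The deadline is supplied by the caller and need not equal one frame interval.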

13.20 Deep Dive: Shader Cache and Persistent Graphics Cache¶

13.20.1 ShaderCache¶

HWUI maintains a persistent shader cache via pipeline/skia/ShaderCache.h. This cache stores compiled GPU shader binaries so they do not need to be recompiled on subsequent app launches:

graph TD
    A["Skia requests<br/>shader compilation"] --> B["ShaderCache::store()"]
    B --> C["Write to disk<br/>(persistent)"]

    D["Skia needs<br/>cached shader"] --> E["ShaderCache::load()"]
    E --> F["Read from disk"]
    F --> G["Return compiled<br/>binary"]

    style B fill:#4CAF50,color:#fff
    style E fill:#2196F3,color:#fff

13.20.2 PersistentGraphicsCache¶

The PersistentGraphicsCache is an additional caching layer that Skia uses through its GrContextOptions::fPersistentCache interface:

// CacheManager.cpp, line 104
void CacheManager::configureContext(
        GrContextOptions* contextOptions,
        const void* identity, ssize_t size) {
    contextOptions->fAllowPathMaskCaching = true;
    contextOptions->fGlyphCacheTextureMaximumBytes =
        mMaxGpuFontAtlasBytes;
    contextOptions->fExecutor = &sDefaultExecutor;

    auto& shaderCache = skiapipeline::ShaderCache::get();
    shaderCache.initShaderDiskCache(identity, size);

    auto& graphicsCache =
        skiapipeline::PersistentGraphicsCache::get();
    contextOptions->fPersistentCache = &graphicsCache;
}

The identity parameter is the GLES version string (for GL) or the Vulkan driver version (for Vulkan), ensuring that cached shaders are invalidated when the driver changes.
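
The invalidation idea can be sketched independently of the on-disk format. `IdentityGuardedCache` below is a hypothetical in-memory model; the real ShaderCache persists its entries, and how it stores and compares the identity is not shown in this chapter:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical in-memory model of an identity-guarded cache. The real
// ShaderCache persists to disk; the invalidation idea is the same.
class IdentityGuardedCache {
public:
    // Called at context setup with a string identifying the driver.
    // A different identity means the stored binaries may be incompatible.
    void init(const std::string& identity) {
        if (identity != mIdentity) {
            mEntries.clear();
            mIdentity = identity;
        }
    }

    void store(const std::string& key, std::string binary) {
        mEntries[key] = std::move(binary);
    }

    // Returns nullptr on a miss so the caller knows to recompile.
    const std::string* load(const std::string& key) const {
        auto it = mEntries.find(key);
        return it == mEntries.end() ? nullptr : &it->second;
    }

private:
    std::string mIdentity;
    std::map<std::string, std::string> mEntries;
};
```

Re-initializing with the same identity keeps existing entries, while a new identity (for example after a driver update) empties the cache so that stale binaries are never handed back to Skia.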

13.20.3 Cache Executor¶

HWUI uses a CommonPoolExecutor for offloading Skia's background work (shader compilation, texture uploads) to a thread pool:

// CacheManager.cpp, line 97
class CommonPoolExecutor : public SkExecutor {
public:
    virtual void add(std::function<void(void)> func) override {
        CommonPool::post(std::move(func));
    }
};

13.21 Deep Dive: The Hint Session (ADPF)¶

13.21.1 Performance Hints¶

HWUI integrates with the Android Dynamic Performance Framework (ADPF) through the HintSessionWrapper. This allows the framework to communicate rendering workload predictions to the CPU/GPU governors:

// CanvasContext.cpp (constructor)
mHintSessionWrapper = std::make_shared<HintSessionWrapper>(
    uiThreadId, renderThreadId);

The hint session reports:

Expected frame completion time
Actual frame completion time
Thread IDs involved in rendering

This enables the platform to:

Boost CPU/GPU frequency for heavy frames
Reduce frequency for light frames
Migrate threads to appropriate CPU cores

13.22 Performance Characteristics and Design Principles¶

13.22.1 Key Design Decisions¶

Double-buffered properties: Staging properties on the UI thread, render properties on the RenderThread. No locks during the hot path.
Recording + replay: Canvas operations are recorded into display lists, then replayed by the RenderThread. This decouples app code from GPU submission.
Lazy GPU context creation: The GPU context is not created until the first frame needs to be rendered, saving memory for backgrounded apps.
Aggressive cache management: The CacheManager continuously prunes GPU resources based on screen size, memory pressure, and app lifecycle state.
Pre-rotation: VulkanSurface handles display rotation in the rendering transform rather than relying on the display controller, reducing composition overhead.
Fence-based synchronization: Native fences (EGL_ANDROID_native_fence_sync) enable GPU-to-GPU synchronization without CPU involvement.

13.22.2 Common Performance Pitfalls¶

Pitfall	Cause	Diagnosis / Mitigation
Jank on first frame	Shader compilation	Check for "shader compile" in Perfetto
High draw time	Too many draw calls	Reduce View hierarchy depth
Excessive layer creation	Alpha animations on complex Views	Override `hasOverlappingRendering()` to return `false` where content does not overlap
GPU memory pressure	Too many large bitmaps	Profile with `dumpsys gfxinfo <package>`
Texture upload stalls	Large bitmaps uploaded on RenderThread	Use `prepareToDraw()` API
VSync misses	Long UI thread work	Move work off the UI thread

13.22.3 Pipeline Comparison¶

graph LR
    subgraph "SkiaGL Pipeline"
        A1["EglManager"] --> B1["EGL Context"]
        B1 --> C1["GrDirectContext<br/>(GL)"]
        C1 --> D1["SkSurface wrapping<br/>FBO 0"]
        D1 --> E1["eglSwapBuffers"]
    end

    subgraph "SkiaVulkan Pipeline"
        A2["VulkanManager"] --> B2["VkDevice"]
        B2 --> C2["GrDirectContext<br/>(Vulkan)"]
        C2 --> D2["SkSurface wrapping<br/>VkImage"]
        D2 --> E2["vkQueuePresentKHR"]
    end

    style A1 fill:#4CAF50,color:#fff
    style A2 fill:#2196F3,color:#fff

Aspect	SkiaGL	SkiaVulkan
Context creation	Faster	Slower (more setup)
Per-frame overhead	Higher (implicit state)	Lower (explicit state)
Shader compilation	Driver-dependent	SPIR-V (more predictable)
Multi-threaded recording	Limited	Better support
Memory management	Driver-managed	Explicit (via Skia)
Pre-rotation	Not supported	Supported (in swapchain)
Buffer age	Via EGL extension	Via VkSwapchain

13.23 Deep Dive: The CanvasContext Draw Flow¶

13.23.1 CanvasContext Lifecycle¶

The CanvasContext is the central coordinator for a window's rendering. Its lifecycle is tied to the window surface:

stateDiagram-v2
    [*] --> Created : CanvasContext create
    Created --> SurfaceSet : setSurface
    SurfaceSet --> Drawing : draw
    Drawing --> Drawing : subsequent frames
    Drawing --> Paused : pauseSurface
    Paused --> Drawing : resumeSurface
    Drawing --> Stopped : setStopped true
    Stopped --> Drawing : setStopped false
    Drawing --> SurfaceLost : surface destroyed
    SurfaceLost --> SurfaceSet : setSurface newWindow
    Stopped --> Destroyed : destroy
    SurfaceLost --> Destroyed : destroy
    Destroyed --> [*]

13.23.2 Surface Setup¶

When a new surface is provided, the CanvasContext configures the pipeline and the native window:

// frameworks/base/libs/hwui/renderthread/CanvasContext.cpp, line 216
void CanvasContext::setSurface(ANativeWindow* window,
                                bool enableTimeout) {
    startHintSession();
    if (window) {
        mNativeSurface =
            std::make_unique<ReliableSurface>(window);
        mNativeSurface->init();
        if (enableTimeout) {
            ANativeWindow_setDequeueTimeout(window, 4000_ms);
        }
    } else {
        mNativeSurface = nullptr;
    }
    setupPipelineSurface();
}

The ReliableSurface wrapper adds robustness to the native window by handling transient errors in dequeueBuffer and queueBuffer.

13.23.3 Pipeline Surface Configuration¶

// CanvasContext.cpp, line 268
void CanvasContext::setupPipelineSurface() {
    bool hasSurface = mRenderPipeline->setSurface(
        mNativeSurface ? mNativeSurface->getNativeWindow()
                       : nullptr,
        mSwapBehavior);

    if (mNativeSurface && !mNativeSurface->didSetExtraBuffers()) {
        setBufferCount(mNativeSurface->getNativeWindow());
    }

    mFrameNumber = 0;
    if (mNativeSurface != nullptr && hasSurface) {
        mHaveNewSurface = true;
        mSwapHistory.clear();
        native_window_enable_frame_timestamps(
            mNativeSurface->getNativeWindow(), true);
        native_window_set_scaling_mode(
            mNativeSurface->getNativeWindow(),
            NATIVE_WINDOW_SCALING_MODE_FREEZE);
    } else {
        mRenderThread.removeFrameCallback(this);
        mGenerationID++;
    }
}

13.23.4 Buffer Count Management¶

The buffer count is calculated based on the window's minimum undequeued buffers:

// CanvasContext.cpp, line 186
static void setBufferCount(ANativeWindow* window) {
    int query_value;
    int err = window->query(window,
        NATIVE_WINDOW_MIN_UNDEQUEUED_BUFFERS, &query_value);
    if (err != 0 || query_value < 0) {
        // Query failed (error logging elided); keep the current count
        return;
    }
    auto min_undequeued_buffers =
        static_cast<uint32_t>(query_value);
    // min_undequeued + 2 because renderahead was already
    // factored into the query
    int bufferCount = min_undequeued_buffers + 2;
    native_window_set_buffer_count(window, bufferCount);
}

When the query returns 1, this yields 3 buffers (triple buffering): one being displayed, one being composited by SurfaceFlinger, and one being rendered to by the app.

13.23.5 The prepareTree Phase¶

prepareTree is the critical tree-walk that syncs all RenderNode properties and display lists:

graph TD
    A["CanvasContext::prepareTree()"] --> B["TreeInfo setup<br/>(MODE_FULL)"]
    B --> C["Root RenderNode<br/>prepareTree()"]
    C --> D["For each child node:"]
    D --> E["pushStagingPropertiesChanges()"]
    D --> F["pushStagingDisplayListChanges()"]
    D --> G["prepareLayer() if needed"]
    D --> H["Animate properties"]
    D --> I["Recurse into children"]

    E --> J["Copy staging props<br/>to render props"]
    F --> K["Swap staging DL<br/>to render DL"]
    G --> L["Create/resize<br/>offscreen layer"]

    style A fill:#2196F3,color:#fff
    style C fill:#4CAF50,color:#fff
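
The staging-to-render sync in this walk can be sketched with a heavily reduced node type. `SketchNode` below is hypothetical: it carries a single property and a display list of plain integers, and it stores its children directly, whereas the real tree is more involved. What it illustrates is that properties are copied, display lists are swapped, and both happen only for dirty nodes:

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, heavily reduced RenderNode: one property and one display list.
struct SketchNode {
    // Written by the UI thread between frames.
    float stagingAlpha = 1.0f;
    std::vector<int> stagingDisplayList;
    bool propertiesDirty = false;
    bool displayListDirty = false;

    // Read by the render thread while drawing.
    float renderAlpha = 1.0f;
    std::vector<int> renderDisplayList;

    std::vector<std::unique_ptr<SketchNode>> children;

    void setAlpha(float alpha) {
        stagingAlpha = alpha;
        propertiesDirty = true;
    }

    void setDisplayList(std::vector<int> ops) {
        stagingDisplayList = std::move(ops);
        displayListDirty = true;
    }

    // Sync walk: push dirty staging state to the render side, then recurse.
    // Returns how many nodes had something to sync.
    int prepareTree() {
        int synced = (propertiesDirty || displayListDirty) ? 1 : 0;
        if (propertiesDirty) {
            renderAlpha = stagingAlpha;  // copy staging props to render props
            propertiesDirty = false;
        }
        if (displayListDirty) {
            renderDisplayList.swap(stagingDisplayList);  // swap, do not copy
            stagingDisplayList.clear();
            displayListDirty = false;
        }
        for (auto& child : children) synced += child->prepareTree();
        return synced;
    }
};
```

Until `prepareTree()` runs, the render-side values are untouched by UI-thread setters, which is what lets the draw phase proceed without locks.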

13.23.6 Frame Skipping Logic¶

The CanvasContext can decide to skip rendering a frame under several conditions:

// DrawFrameTask.cpp, line 107
canDrawThisFrame = !info.out.skippedFrameReason.has_value();

Frames are skipped when:

No output target (surface lost)
Context is stopped (app backgrounded)
No content changes and no forced redraw

When a frame is skipped, any pending texture uploads are still flushed:

// DrawFrameTask.cpp, line 143
if (CC_LIKELY(canDrawThisFrame)) {
    context->draw(solelyTextureViewUpdates);
} else {
    // Flush pending texture uploads
    if (GrDirectContext* grContext =
            mRenderThread->getGrContext()) {
        grContext->flushAndSubmit();
    }
    context->waitOnFences();
}

13.24 Deep Dive: WebView Integration¶

13.24.1 WebView Functors¶

WebView uses a special rendering path because it has its own GL/Vulkan context. HWUI supports this through "functors" -- callbacks that WebView registers to draw its content:

// Canvas.h, line 150
virtual void drawWebViewFunctor(int /*functor*/) {
    LOG_ALWAYS_FATAL("Not supported");
}

13.24.2 VkInteropFunctorDrawable¶

When running on the Vulkan pipeline, WebView's GL content must be interoperated with Vulkan. The VkInteropFunctorDrawable class handles this translation:

graph TD
    A["HWUI Vulkan Pipeline"] --> B["VkInteropFunctorDrawable"]
    B --> C["Allocate shared<br/>AHardwareBuffer"]
    C --> D["WebView renders<br/>GL content<br/>into AHardwareBuffer"]
    D --> E["Import AHardwareBuffer<br/>as VkImage"]
    E --> F["Composite into<br/>Vulkan frame"]

    style A fill:#2196F3,color:#fff
    style D fill:#4CAF50,color:#fff

13.24.3 Functor Layer Requirements¶

When a WebView functor is present in the tree, HWUI may need to force layer creation for correct clipping behavior:

// RenderProperties.h, line 167
bool prepareForFunctorPresence(
        bool willHaveFunctor,
        bool ancestorDictatesFunctorsNeedLayer) {
    bool functorsNeedLayer =
        ancestorDictatesFunctorsNeedLayer ||
        CC_UNLIKELY(isClipMayBeComplex()) ||
        CC_UNLIKELY(getOutline().willComplexClip()) ||
        CC_UNLIKELY(getRevealClip().willClip()) ||
        CC_UNLIKELY(getTransformMatrix() &&
            !getTransformMatrix()->isScaleTranslate());
    mComputedFields.mNeedLayerForFunctors =
        (willHaveFunctor && functorsNeedLayer);
    return CC_LIKELY(
        effectiveLayerType() == LayerType::None) &&
        functorsNeedLayer;
}

13.25 Deep Dive: Shadows and Elevation¶

13.25.1 The Elevation Model¶

Android's Material Design elevation system creates ambient and spot shadows for Views with positive Z values:

// RenderProperties.h, line 528
bool hasShadow() const {
    return getZ() > 0.0f &&
           getOutline().getPath() != nullptr &&
           getOutline().getAlpha() != 0.0f;
}

13.25.2 Shadow Colors¶

Each RenderNode has independent shadow colors:

// RenderProperties.h, line 533
SkColor getSpotShadowColor() const {
    return mPrimitiveFields.mSpotShadowColor;
}
SkColor getAmbientShadowColor() const {
    return mPrimitiveFields.mAmbientShadowColor;
}

13.25.3 Light Source¶

The LightingInfo module maintains a global light source position used for spot shadow calculations. The light geometry is updated before each frame:

// SkiaOpenGLPipeline.cpp, line 163
SkPoint lightCenter = preTransform.mapXY(
    lightGeometry.center.x, lightGeometry.center.y);
LightGeometry localGeometry = lightGeometry;
localGeometry.center.x = lightCenter.fX;
localGeometry.center.y = lightCenter.fY;
LightingInfo::updateLighting(localGeometry, lightInfo);

13.25.4 Shadow Rendering in Skia¶

Skia renders shadows using SkShadowUtils. The shadow computation considers:

View elevation (Z translation + static elevation)
Light source position and radius
Ambient light intensity
Outline shape (rectangle, rounded rectangle, or path)

graph TD
    A["RenderNode with Z > 0"] --> B["Compute shadow params"]
    B --> C["SkShadowUtils::DrawShadow()"]
    C --> D["Ambient shadow<br/>(soft, all around)"]
    C --> E["Spot shadow<br/>(directional, below)"]
    D --> F["Composited<br/>on canvas"]
    E --> F

    style C fill:#FF9800,color:#fff

13.26 Deep Dive: The DamageAccumulator¶

13.26.1 Purpose¶

The DamageAccumulator tracks which regions of the screen need to be redrawn during a tree traversal. As prepareTree walks the RenderNode tree, each modified node reports its damage to the accumulator.

13.26.2 Transform Tracking¶

The DamageAccumulator also tracks the current transform from each node to the root, which is needed for:

Mapping node-local damage to screen coordinates
Computing the light source position relative to each layer
Determining shadow parameters

13.26.3 Damage Propagation¶

When a RenderNode property changes, the damage is propagated up through the tree:

// RenderNode.h, line 248
void damageSelf(TreeInfo& info);

If a node changes alpha, transform, or clip, its entire bounds are damaged. If only the display list content changes, only the union of old and new content bounds is damaged.

13.27 Deep Dive: Memory Policies¶

13.27.1 Memory Policy Configuration¶

The CacheManager uses a MemoryPolicy structure that defines memory behavior based on the device characteristics:

graph TD
    A["Device Boot"] --> B["loadMemoryPolicy()"]
    B --> C{"System or<br/>Persistent?"}
    C -->|Yes| D["Higher limits<br/>Longer retention"]
    C -->|No| E{"Foreground<br/>Service?"}
    E -->|Yes| F["Standard limits"]
    E -->|No| G["Lower limits<br/>Shorter retention"]

    style D fill:#4CAF50,color:#fff
    style F fill:#2196F3,color:#fff
    style G fill:#FF9800,color:#fff

13.27.2 Resource Budget Calculation¶

The GPU memory budget is derived from the screen area:

maxResourceBytes = screenWidth * screenHeight *
                   surfaceSizeMultiplier

For a 1080x2400 display with a multiplier of 8:

maxResourceBytes = 1080 * 2400 * 8 = 20,736,000 bytes (~20 MB)

13.27.3 Background Retention¶

When the app goes to the background, GPU resources are reduced to a fraction of the foreground budget:

backgroundResourceBytes = maxResourceBytes *
                          backgroundRetentionPercent

Retention is typically 50%, so the roughly 20 MB foreground budget becomes about 10 MB in the background.
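
Both formulas fit in a few lines. The helper below is a hypothetical illustration (`GpuBudget` and `computeGpuBudget` are not CacheManager names) and, like the formula above, treats the retention value as a fraction such as 0.5:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the two formulas above.
struct GpuBudget {
    std::int64_t foregroundBytes;
    std::int64_t backgroundBytes;
};

GpuBudget computeGpuBudget(int screenWidth, int screenHeight,
                           double surfaceSizeMultiplier,
                           double backgroundRetentionPercent) {
    assert(screenWidth > 0 && screenHeight > 0);
    // Foreground budget scales with the number of pixels on screen.
    const double foreground =
        static_cast<double>(screenWidth) * screenHeight * surfaceSizeMultiplier;
    // Background budget is a fraction (e.g. 0.5) of the foreground budget.
    const double background = foreground * backgroundRetentionPercent;
    return {static_cast<std::int64_t>(foreground),
            static_cast<std::int64_t>(background)};
}
```

`computeGpuBudget(1080, 2400, 8.0, 0.5)` reproduces the figures in the text: 20,736,000 bytes in the foreground and 10,368,000 bytes in the background.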

13.27.4 Context Destruction Timeout¶

When all CanvasContexts are stopped (all windows hidden), the CacheManager schedules the GPU context for destruction after a timeout:

// CacheManager.cpp, line 298
void CacheManager::scheduleDestroyContext() {
    if (mMemoryPolicy.contextTimeout > 0) {
        mRenderThread.queue().postDelayed(
            mMemoryPolicy.contextTimeout,
            [this, genId = mGenerationId] {
                if (mGenerationId != genId) return;
                if (!areAllContextsStopped()) return;
                mRenderThread.destroyRenderingContext();
            });
    }
}

This releases all GPU memory for fully backgrounded apps.
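
The generation check is what makes the delayed task safe to leave in the queue. The sketch below models it with hypothetical names (`SketchCacheManager`, `fireDelayedTasks`) and collects delayed tasks in a vector so that the example stays single-threaded. It assumes the generation is bumped whenever a context resumes; where exactly HWUI increments it is not shown in the excerpt:

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical model: delayed tasks are collected and fired manually so the
// example stays single-threaded and deterministic.
class SketchCacheManager {
public:
    // A window became visible again: bump the generation so that any
    // destroy task scheduled earlier turns into a no-op.
    void onContextResumed() {
        mAllStopped = false;
        ++mGenerationId;
    }

    void onAllContextsStopped() {
        mAllStopped = true;
        scheduleDestroyContext();
    }

    // Stands in for the timeout expiring on the render thread.
    void fireDelayedTasks() {
        std::vector<std::function<void()>> tasks = std::move(mDelayed);
        mDelayed.clear();
        for (auto& task : tasks) task();
    }

    bool contextAlive() const { return mContextAlive; }

private:
    void scheduleDestroyContext() {
        mDelayed.push_back([this, genId = mGenerationId] {
            if (mGenerationId != genId) return;  // stale request
            if (!mAllStopped) return;            // something is visible again
            mContextAlive = false;
        });
    }

    bool mContextAlive = true;
    bool mAllStopped = false;
    unsigned mGenerationId = 0;
    std::vector<std::function<void()>> mDelayed;
};
```

Capturing the generation by value avoids having to cancel the task explicitly: a stale task simply notices that the world has moved on and returns.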

13.28 Deep Dive: Fence Synchronization¶

13.28.1 The Role of Fences¶

Fences are the primary synchronization mechanism in Android's graphics stack. They allow GPU work to be tracked without CPU blocking.

graph TD
    subgraph "Fence Types"
        A["EGL Fence Sync<br/>(eglCreateSyncKHR)"]
        B["Native Fence<br/>(Android sync fd)"]
        C["Vulkan Semaphore<br/>(VkSemaphore)"]
        D["Vulkan Fence<br/>(VkFence)"]
    end

    subgraph "Usage Points"
        E["Buffer release"] --> A
        E --> B
        F["Frame presentation"] --> B
        F --> C
        G["CPU wait on GPU"] --> D
        G --> A
    end

    style A fill:#4CAF50,color:#fff
    style B fill:#2196F3,color:#fff
    style C fill:#FF9800,color:#fff
    style D fill:#F44336,color:#fff

13.28.2 Native Fence Sync in EGL¶

The EglManager creates native fence file descriptors for cross-process synchronization:

// EglManager.cpp, line 732
status_t EglManager::createReleaseFence(
        bool useFenceSync, EGLSyncKHR* eglFence,
        int* nativeFence) {
    *nativeFence = -1;
    if (EglExtensions.nativeFenceSync) {
        EGLSyncKHR sync = eglCreateSyncKHR(
            mEglDisplay,
            EGL_SYNC_NATIVE_FENCE_ANDROID,
            nullptr);
        glFlush();
        int fenceFd = eglDupNativeFenceFDANDROID(
            mEglDisplay, sync);
        eglDestroySyncKHR(mEglDisplay, sync);
        *nativeFence = fenceFd;
        *eglFence = EGL_NO_SYNC_KHR;
    } else if (useFenceSync && EglExtensions.fenceSync) {
        // Fall back to EGL fence sync
        *eglFence = eglCreateSyncKHR(
            mEglDisplay, EGL_SYNC_FENCE_KHR, nullptr);
        glFlush();
    }
    return OK;
}

13.28.3 GPU-Side Fence Wait¶

The critical fenceWait method allows the GPU to wait on a fence without blocking the CPU:

// EglManager.cpp, line 689
status_t EglManager::fenceWait(int fence) {
    if (EglExtensions.waitSync && EglExtensions.nativeFenceSync) {
        // GPU-side wait: no CPU blocking
        int fenceFd = ::dup(fence);
        EGLint attribs[] = {
            EGL_SYNC_NATIVE_FENCE_FD_ANDROID, fenceFd,
            EGL_NONE
        };
        EGLSyncKHR sync = eglCreateSyncKHR(mEglDisplay,
            EGL_SYNC_NATIVE_FENCE_ANDROID, attribs);
        eglWaitSyncKHR(mEglDisplay, sync, 0);
        eglDestroySyncKHR(mEglDisplay, sync);
    } else {
        // CPU-side wait: blocks the calling thread
        sync_wait(fence, -1);
    }
    return OK;
}

The GPU-side wait is strongly preferred because it allows the CPU to continue preparing the next frame while the GPU waits for the fence to signal.

13.29 Deep Dive: Stretch and Overscroll Effects¶

13.29.1 Stretch Effect¶

Android 12 introduced a stretch/overscroll effect that deforms the content when the user scrolls past the edge. This is implemented through the StretchEffect class:

// RenderProperties.h, line 103
const StretchEffect& getStretchEffect() const {
    return mStretchEffect;
}
StretchEffect& mutableStretchEffect() {
    return mStretchEffect;
}

13.29.2 Layer Requirement for Stretch¶

The stretch effect requires a layer to apply the deformation as a post-processing step:

// RenderProperties.h, line 555
bool promotedToLayer() const {
    return mLayerProperties.mType == LayerType::None &&
           fitsOnLayer() &&
           (/* ... other conditions elided ... */
            mLayerProperties.getStretchEffect().requiresLayer()
            /* || ... */);
}

13.29.3 StretchMask¶

The StretchMask on each RenderNode defines the region to which the stretch effect applies:

// RenderNode.h, line 130
StretchMask& getStretchMask() { return mStretchMask; }

13.30 Deep Dive: Force Dark (Dark Theme)¶

13.30.1 Automatic Dark Theme¶

HWUI includes a "force dark" mode that automatically darkens the rendered content of apps that do not natively support a dark theme:

// RenderNode.h (private methods)
void handleForceDark(TreeInfo* info);
bool shouldEnableForceDark(TreeInfo* info);
bool isForceInvertDark(TreeInfo& info);

13.30.2 Per-Node Opt-Out¶

Individual Views can opt out of force dark transformation:

// RenderProperties.h, line 564
bool setAllowForceDark(bool allow) {
    return RP_SET(mPrimitiveFields.mAllowForceDark, allow);
}
bool getAllowForceDark() const {
    return mPrimitiveFields.mAllowForceDark;
}

13.30.3 Color Transform¶

When force dark is active, the display list undergoes a color transform that inverts luminance while preserving hue:

// DisplayList.h, line 151
void applyColorTransform(ColorTransform transform) {
    if (mImpl) {
        mImpl->applyColorTransform(transform);
    }
}

13.31 Deep Dive: Hole Punching¶

13.31.1 What is Hole Punching¶

Hole punching is a technique where HWUI creates a transparent "hole" in its rendered content, allowing a hardware overlay (e.g., a video surface or camera preview) to show through:

// Canvas.h, line 154
virtual void punchHole(const SkRRect& rect, float alpha) = 0;

13.31.2 Usage in the Pipeline¶

graph TD
    A["App Window<br/>(HWUI rendered)"] --> B["Hole Punch<br/>(transparent region)"]
    B --> C["Hardware Overlay<br/>(video decoder output)"]
    C --> D["Display"]

    E["SurfaceFlinger"] --> F["App layer with hole"]
    E --> G["Video layer underneath"]
    F --> D
    G --> D

    style B fill:#FF9800,color:#fff
    style C fill:#4CAF50,color:#fff

Hole punching is tracked per-RenderNode:

// RenderNode.h, line 295
bool mHasHolePunches;

13.32 Build System Integration¶

13.32.1 HWUI Build Configuration¶

HWUI is built as part of frameworks/base and links against both Skia and the native graphics libraries. Key build targets:

libhwui -- The main HWUI shared library
hwui_unit_tests -- Native unit tests
hwui_static_deps -- Static dependency libraries

13.32.2 Skia Build Integration¶

Skia is built from external/skia/ with Android-specific build configuration that:

Enables the Ganesh GPU backend (GL and Vulkan)
Enables Android-specific SkSurface extensions
Configures SIMD optimizations for the target architecture
Excludes unused backends (Metal, Dawn, D3D)

13.32.3 Vulkan Loader Build¶

The Vulkan loader (libvulkan.so) is built from frameworks/native/vulkan/libvulkan/ with auto-generated dispatch tables from the Vulkan specification XML.

13.33 Testing Infrastructure¶

13.33.1 HWUI Tests¶

HWUI includes several test suites:

Unit tests (tests/unit/): Test individual classes like RenderNode, RenderProperties, DamageAccumulator
Rendering tests (tests/rendering/): Pixel-perfect rendering comparison tests
Macro benchmarks (tests/macrobench/): Performance benchmarks for the full rendering pipeline

13.33.2 CTS Graphics Tests¶

The Compatibility Test Suite includes extensive graphics tests:

CtsGraphicsTestCases: Tests for Canvas, Paint, Path, Bitmap
CtsUiRenderingTestCases: Tests for hardware-accelerated rendering
CtsVulkanTestCases: Vulkan CTS (based on dEQP)
CtsEglTestCases: EGL conformance tests

13.33.3 Perfetto Integration for Testing¶

HWUI emits ATRACE events that Perfetto can record as named slices, which makes automated performance testing possible:

// DrawFrameTask.cpp, line 91
ATRACE_FORMAT("DrawFrames %" PRId64, vsyncId);

// RenderThread.cpp, line 92
ATRACE_FORMAT("queue mFrameCallbackTask to run after %.2fms",
    toFloatMillis(runAt - SteadyClock::now()).count());

These trace events can be captured and analyzed in CI pipelines to detect performance regressions.
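As a rough illustration of what such a CI check might do, the sketch below parses a slice name produced by the `DrawFrames` format string and applies a simple tolerance test. Both helper names and the 10% style threshold are assumptions for illustration, not part of HWUI or Perfetto.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <optional>
#include <string>
#include <system_error>

// Extracts the vsync id from a slice name of the form "DrawFrames <id>".
// Returns std::nullopt for any other slice name.
inline std::optional<std::int64_t> parseDrawFramesVsyncId(const std::string& sliceName) {
    static const std::string kPrefix = "DrawFrames ";
    if (sliceName.compare(0, kPrefix.size(), kPrefix) != 0) return std::nullopt;
    const char* first = sliceName.data() + kPrefix.size();
    const char* last = sliceName.data() + sliceName.size();
    std::int64_t id = 0;
    const std::from_chars_result result = std::from_chars(first, last, id);
    if (result.ec != std::errc() || result.ptr != last) return std::nullopt;
    return id;
}

// Flags a regression when the current duration exceeds the baseline by more
// than the given fraction (0.10 means 10%).
inline bool isRegression(double baselineMs, double currentMs, double tolerance) {
    assert(baselineMs > 0.0 && tolerance >= 0.0);
    return currentMs > baselineMs * (1.0 + tolerance);
}
```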

13.34 Evolution and Future Directions¶

13.34.1 Historical Pipeline Evolution¶

timeline
    title Android Graphics Pipeline Evolution
    section Early Android (1.0-2.x)
        Software rendering only : Skia CPU backend
    section Honeycomb (3.0)
        Hardware acceleration : HWUI v1 with OpenGL ES 2.0
    section Ice Cream Sandwich (4.0)
        GPU rendering default : DisplayList renderer
    section Lollipop (5.0)
        RenderThread : Async GPU rendering
    section Nougat (7.0)
        Vulkan 1.0 : New GPU API support
    section Pie (9.0)
        Skia pipeline : Replaced legacy GLES renderer
    section Android 10
        ANGLE : GL-on-Vulkan translation layer
    section Android 12
        Vulkan default : Primary render pipeline
        Stretch overscroll : New visual effect
    section Android 13+
        Graphite development : Next-gen Skia backend
        ADPF integration : Performance hints

13.34.2 Graphite Adoption Path¶

Skia's Graphite backend is being developed as the successor to Ganesh. Its adoption path for Android includes:

Feature parity with Ganesh for Android use cases
Performance validation on representative workloads
Gradual rollout behind feature flags
Eventual replacement of Ganesh in HWUI

13.34.3 Vulkan-First Strategy¶

AOSP is moving toward a Vulkan-first strategy where:

Vulkan is the default rendering API for HWUI
ANGLE provides GLES compatibility on top of Vulkan
The Vulkan driver is updatable via APEX modules
RenderEngine in SurfaceFlinger uses the Vulkan backend

This simplifies the stack by having a single GPU API path while maintaining backward compatibility through ANGLE.

13.34.4 GPU Driver Updatability¶

The APEX-based driver loading mechanism (LoadDriverFromApex in driver.cpp) enables:

GPU driver updates delivered independently of full-system OTA updates
Faster bug fixes for GPU-related issues
Per-device driver optimization
A/B driver testing

13.35 Deep Dive: The IRenderPipeline Interface¶

13.35.1 Pipeline Abstraction¶

The IRenderPipeline interface defines the contract that both SkiaOpenGLPipeline and SkiaVulkanPipeline implement. This interface is the abstraction boundary between the rendering logic and the GPU API:

classDiagram
    class IRenderPipeline {
        <<interface>>
        +makeCurrent() MakeCurrentResult
        +getFrame() Frame
        +draw() DrawResult
        +swapBuffers() bool
        +setSurface() bool
        +createTextureLayer() DeferredLayerUpdater*
        +onStop()
        +onContextDestroyed()
        +isSurfaceReady() bool
        +isContextReady() bool
        +flush() unique_fd
    }

    class SkiaPipeline {
        #mRenderThread : RenderThread&
        #mColorMode : ColorMode
        +renderFrame()
        +renderLayers()
    }

    class SkiaGpuPipeline {
        -mPinnedImages : vector
        +createOrUpdateLayer()
        +pinImages()
        +unpinImages()
        +getBufferSkSurface()
    }

    class SkiaOpenGLPipeline {
        -mEglManager : EglManager&
        -mEglSurface : EGLSurface
        +makeCurrent()
        +draw()
        +swapBuffers()
    }

    class SkiaVulkanPipeline {
        -mVkSurface : VulkanSurface*
        +makeCurrent()
        +draw()
        +swapBuffers()
    }

    IRenderPipeline <|-- SkiaPipeline
    SkiaPipeline <|-- SkiaGpuPipeline
    SkiaGpuPipeline <|-- SkiaOpenGLPipeline
    SkiaGpuPipeline <|-- SkiaVulkanPipeline
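The value of this abstraction is that the frame loop never needs to know which GPU API is underneath. The following sketch reduces the contract to three calls; all class names carry a `Sketch` suffix because they are simplified stand-ins, and the real interface has more methods and different signatures.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Simplified stand-in for the pipeline contract shown in the diagram.
class RenderPipelineSketch {
public:
    virtual ~RenderPipelineSketch() = default;
    virtual bool makeCurrent() = 0;
    virtual bool draw() = 0;
    virtual bool swapBuffers() = 0;
    virtual std::string apiName() const = 0;  // hypothetical helper
};

class GlPipelineSketch final : public RenderPipelineSketch {
public:
    bool makeCurrent() override { mCurrent = true; return true; }
    bool draw() override { return mCurrent; }  // GL needs a current context
    bool swapBuffers() override { return true; }
    std::string apiName() const override { return "OpenGL ES"; }

private:
    bool mCurrent = false;
};

class VkPipelineSketch final : public RenderPipelineSketch {
public:
    bool makeCurrent() override { return true; }  // no thread-bound context
    bool draw() override { return true; }
    bool swapBuffers() override { return true; }
    std::string apiName() const override { return "Vulkan"; }
};

inline std::unique_ptr<RenderPipelineSketch> makePipeline(bool useVulkan) {
    if (useVulkan) return std::make_unique<VkPipelineSketch>();
    return std::make_unique<GlPipelineSketch>();
}

// API-agnostic frame loop: returns how many frames were presented.
inline int renderFrames(RenderPipelineSketch& pipeline, int count) {
    assert(count >= 0);
    int presented = 0;
    for (int i = 0; i < count; ++i) {
        if (pipeline.makeCurrent() && pipeline.draw() && pipeline.swapBuffers()) {
            ++presented;
        }
    }
    return presented;
}
```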

13.35.2 The DrawResult Structure¶

The draw result tells the caller whether the draw succeeded, when the GPU work was submitted, and which fence signals presentation:

struct DrawResult {
    bool success;            // Whether the draw succeeded
    int64_t submissionTime;  // When GPU work was submitted
    android::base::unique_fd presentFence; // Fence for presentation
};

13.35.3 Pipeline Selection Decision Tree¶

graph TD
    A["System Property<br/>debug.hwui.renderer"] --> B{"Value?"}
    B -->|"skiavk"| C["SkiaVulkan"]
    B -->|"skiagl"| D["SkiaGL"]
    B -->|"not set"| E["Default Selection"]
    E --> F{"Vulkan Driver<br/>Available?"}
    F -->|Yes| G{"Device Config<br/>Prefers Vulkan?"}
    G -->|Yes| C
    G -->|No| D
    F -->|No| D

    style C fill:#2196F3,color:#fff
    style D fill:#4CAF50,color:#fff
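The decision tree translates directly into a small function. This is a sketch of the diagram only: `selectPipeline` and its parameters are hypothetical names, and treating an unrecognized property value like "not set" is an assumption of the sketch.

```cpp
#include <cassert>
#include <string>

enum class PipelineType { SkiaGL, SkiaVulkan };

// Follows the diagram: an explicit property value wins, otherwise Vulkan is
// chosen only when a driver exists and the device configuration prefers it.
inline PipelineType selectPipeline(const std::string& rendererProperty,
                                   bool vulkanDriverAvailable,
                                   bool devicePrefersVulkan) {
    if (rendererProperty == "skiavk") return PipelineType::SkiaVulkan;
    if (rendererProperty == "skiagl") return PipelineType::SkiaGL;
    // Assumption: anything else is handled like an unset property.
    if (vulkanDriverAvailable && devicePrefersVulkan) return PipelineType::SkiaVulkan;
    return PipelineType::SkiaGL;
}

// Maps a pipeline type back to the property value that would force it.
inline const char* toPropertyValue(PipelineType type) {
    switch (type) {
        case PipelineType::SkiaGL: return "skiagl";
        case PipelineType::SkiaVulkan: return "skiavk";
    }
    assert(false && "unhandled PipelineType");
    return "";
}
```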

13.36 Deep Dive: The RenderState¶

13.36.1 Purpose¶

The RenderState tracks global rendering state on the RenderThread, including:

Active layers (for memory tracking)
Context destruction callbacks
GPU resource cleanup

13.36.2 Context Callbacks¶

Both SkiaOpenGLPipeline and SkiaVulkanPipeline register as context callbacks:

// SkiaOpenGLPipeline.cpp, line 49
SkiaOpenGLPipeline::SkiaOpenGLPipeline(RenderThread& thread)
    : SkiaGpuPipeline(thread), mEglManager(thread.eglManager()) {
    thread.renderState().registerContextCallback(this);
}

When the GPU context is destroyed (e.g., during memory trimming), all registered callbacks are notified so they can release their GPU resources.
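The registration and notification flow is a classic observer pattern. The sketch below models it with hypothetical `*Sketch` classes; the integer resource counter stands in for real GPU objects, and unregistering in the destructor is an assumption about how a callback would avoid being notified after it is gone.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

class ContextCallbackSketch {
public:
    virtual ~ContextCallbackSketch() = default;
    virtual void onContextDestroyed() = 0;
};

// Keeps the set of interested parties and notifies them on context loss.
class RenderStateSketch {
public:
    void registerContextCallback(ContextCallbackSketch* cb) {
        assert(cb != nullptr);
        mCallbacks.insert(cb);
    }
    void removeContextCallback(ContextCallbackSketch* cb) { mCallbacks.erase(cb); }
    void onContextDestroyed() {
        for (ContextCallbackSketch* cb : mCallbacks) cb->onContextDestroyed();
    }
    std::size_t callbackCount() const { return mCallbacks.size(); }

private:
    std::set<ContextCallbackSketch*> mCallbacks;
};

// Registers itself on construction and releases its resources when notified.
class PipelineSketch final : public ContextCallbackSketch {
public:
    explicit PipelineSketch(RenderStateSketch& state) : mState(state) {
        mState.registerContextCallback(this);
    }
    ~PipelineSketch() override { mState.removeContextCallback(this); }

    void allocate(int count) { mGpuResources += count; }
    int gpuResources() const { return mGpuResources; }
    void onContextDestroyed() override { mGpuResources = 0; }

private:
    RenderStateSketch& mState;
    int mGpuResources = 0;  // stand-in for textures, surfaces, and so on
};
```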

13.36.3 Layer Tracking¶

The RenderState maintains a set of active layers for memory reporting:

// CacheManager.cpp, line 244
// layerType and layerMemoryTotal are declared earlier in the enclosing function.
for (std::set<Layer*>::iterator it =
        renderState->mActiveLayers.begin();
     it != renderState->mActiveLayers.end(); it++) {
    const Layer* layer = *it;
    log.appendFormat("    %s size %dx%d\n",
        layerType, layer->getWidth(), layer->getHeight());
    // Estimate only: assumes 4 bytes per pixel for every layer.
    layerMemoryTotal +=
        layer->getWidth() * layer->getHeight() * 4;
}
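The memory estimate in this loop is simple arithmetic: width times height times bytes per pixel, summed over all layers. A standalone sketch, with the hypothetical `LayerSize` struct standing in for `Layer`, looks like this:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the dimensions of an offscreen layer.
struct LayerSize {
    int width;
    int height;
};

// Sums width * height * bytesPerPixel over all layers. The default of 4
// mirrors the constant in the excerpt (for example, 8-bit RGBA).
inline std::size_t estimateLayerBytes(const std::vector<LayerSize>& layers,
                                      std::size_t bytesPerPixel = 4) {
    std::size_t total = 0;
    for (const LayerSize& layer : layers) {
        assert(layer.width >= 0 && layer.height >= 0);
        // Widen before multiplying so large layers cannot overflow int.
        total += static_cast<std::size_t>(layer.width) *
                 static_cast<std::size_t>(layer.height) * bytesPerPixel;
    }
    return total;
}
```

For instance, a single 1080x1920 layer is estimated at 8,294,400 bytes, a little under 8 MiB.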

13.37 Deep Dive: Frame Timing and Jank Detection¶

13.37.1 The JankTracker¶

HWUI includes a built-in jank detector (JankTracker.h) that monitors frame timing and classifies frame drops:

graph TD
    A["Frame Completed"] --> B["JankTracker::finishFrame()"]
    B --> C{"Frame Duration<br/>> Deadline?"}
    C -->|Yes| D["Classify Jank"]
    C -->|No| E["Normal Frame"]

    D --> F{"Cause?"}
    F -->|"UI thread slow"| G["JANK_UI_THREAD"]
    F -->|"RenderThread slow"| H["JANK_RT"]
    F -->|"GPU slow"| I["JANK_GPU"]
    F -->|"Buffer stall"| J["JANK_DEQUEUE_BUFFER"]
    F -->|"Swap stall"| K["JANK_SWAP_BUFFERS"]

    style D fill:#F44336,color:#fff
    style E fill:#4CAF50,color:#fff
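The classification step in the diagram can be sketched as "if the frame missed its deadline, blame the slowest stage". That heuristic, the `JankCause` enum and the `StageDurationsNs` struct are all assumptions made for illustration; they are not the logic or the identifiers used by JankTracker.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical causes mirroring the branches of the diagram.
enum class JankCause { None, UiThread, RenderThread, Gpu, DequeueBuffer, SwapBuffers };

// Hypothetical per-stage durations for one frame, in nanoseconds.
struct StageDurationsNs {
    std::int64_t uiThread;
    std::int64_t renderThread;
    std::int64_t gpu;
    std::int64_t dequeueBuffer;
    std::int64_t swapBuffers;
};

// Returns None for on-time frames; otherwise the stage with the longest
// duration (ties go to the earlier stage in the list).
inline JankCause classifyFrame(const StageDurationsNs& d, std::int64_t totalNs,
                               std::int64_t deadlineNs) {
    assert(totalNs >= 0 && deadlineNs > 0);
    if (totalNs <= deadlineNs) return JankCause::None;

    JankCause cause = JankCause::UiThread;
    std::int64_t worst = d.uiThread;
    const auto consider = [&cause, &worst](JankCause candidate, std::int64_t ns) {
        if (ns > worst) {
            worst = ns;
            cause = candidate;
        }
    };
    consider(JankCause::RenderThread, d.renderThread);
    consider(JankCause::Gpu, d.gpu);
    consider(JankCause::DequeueBuffer, d.dequeueBuffer);
    consider(JankCause::SwapBuffers, d.swapBuffers);
    return cause;
}
```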

13.37.2 Frame Info Tracking¶

Each frame's timing is recorded in a FrameInfo array. Most slots hold timestamps; a few hold durations or identifiers:

Index	Name	Thread	Description
0	IntendedVsync	UI	Target VSYNC time
1	Vsync	UI	Actual VSYNC time
2	HandleInputStart	UI	Start of input processing
3	AnimationStart	UI	Start of animations
4	PerformTraversalsStart	UI	Start of measure/layout
5	DrawStart	UI	Start of draw recording
6	SyncQueued	UI	Time sync was queued
7	SyncStart	RT	Start of sync on RenderThread
8	IssueDrawCommandsStart	RT	Start of GPU command issue
9	SwapBuffers	RT	Time of buffer swap
10	FrameCompleted	RT	Frame fully complete
11	DequeueBufferDuration	RT	Time spent dequeuing buffer
12	QueueBufferDuration	RT	Time spent queuing buffer
13	GpuCompleted	GPU	GPU work completion time
14	SwapBuffersDuration	RT	Duration of swap operation
15	FrameDeadline	-	Deadline for this frame
16	FrameStartTime	-	Frame start timestamp
17	FrameInterval	-	Expected frame interval
18	VsyncId	-	VSYNC identifier
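Per-stage durations fall out of this table by subtracting one timestamp from another. The sketch below takes its indices from the table (they may not match a given AOSP release) and assumes that all timestamps, including FrameDeadline, are absolute values in nanoseconds; the helper names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Indices copied from the table; only the slots used here are listed.
enum FrameInfoIndexSketch : std::size_t {
    kIntendedVsync = 0,
    kSyncQueued = 6,
    kSyncStart = 7,
    kIssueDrawCommandsStart = 8,
    kSwapBuffers = 9,
    kFrameCompleted = 10,
    kFrameDeadline = 15,
    kFrameInfoCount = 19
};

using FrameTimestamps = std::array<std::int64_t, kFrameInfoCount>;

// Elapsed time between two timestamp slots.
inline std::int64_t durationNs(const FrameTimestamps& f, std::size_t start,
                               std::size_t end) {
    assert(start < kFrameInfoCount && end < kFrameInfoCount);
    return f[end] - f[start];
}

inline std::int64_t totalFrameNs(const FrameTimestamps& f) {
    return durationNs(f, kIntendedVsync, kFrameCompleted);
}
inline std::int64_t uiThreadNs(const FrameTimestamps& f) {
    return durationNs(f, kIntendedVsync, kSyncQueued);
}
inline std::int64_t syncNs(const FrameTimestamps& f) {
    return durationNs(f, kSyncStart, kIssueDrawCommandsStart);
}
inline std::int64_t drawNs(const FrameTimestamps& f) {
    return durationNs(f, kIssueDrawCommandsStart, kSwapBuffers);
}
inline bool missedDeadline(const FrameTimestamps& f) {
    return f[kFrameCompleted] > f[kFrameDeadline];
}
```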

13.37.3 GPU Profiling Visualization¶

The FrameInfoVisualizer draws colored bars on-screen showing per-frame timing:

// SkiaOpenGLPipeline.cpp, line 172
if (CC_UNLIKELY(Properties::showDirtyRegions ||
    ProfileType::None != Properties::getProfileType())) {
    std::scoped_lock lock(profilerLock);
    SkCanvas* profileCanvas = surface->getCanvas();
    SkiaProfileRenderer profileRenderer(
        profileCanvas, frame.width(), frame.height());
    profiler->draw(profileRenderer);
}

The bars are drawn directly onto the surface canvas after the main frame content, providing real-time performance visualization.

13.38 Deep Dive: The CommonPool Thread Pool¶

13.38.1 Background Work Distribution¶

HWUI uses a CommonPool thread pool for non-time-critical work:

// CacheManager.cpp, line 97
class CommonPoolExecutor : public SkExecutor {
public:
    virtual void add(std::function<void(void)> func) override {
        CommonPool::post(std::move(func));
    }
};

This pool handles:

Shader compilation on background threads
Texture upload scheduling
Deferred GPU resource cleanup
Image decoding tasks
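The `CommonPoolExecutor` excerpt is an adapter: it exposes the executor interface a library expects and forwards every task to a shared pool. The sketch below reproduces that shape with hypothetical classes. To stay deterministic it uses a single-threaded queue that runs tasks when `drain()` is called, whereas a real pool would run them on worker threads.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical executor interface, playing the role SkExecutor plays above.
class ExecutorSketch {
public:
    virtual ~ExecutorSketch() = default;
    virtual void add(std::function<void()> task) = 0;
};

// Single-threaded stand-in for a shared pool: post() queues, drain() runs.
class QueuePoolSketch {
public:
    void post(std::function<void()> task) {
        assert(task);  // reject empty std::function objects
        mTasks.push_back(std::move(task));
    }
    // Runs queued tasks in FIFO order and returns how many were executed.
    std::size_t drain() {
        std::size_t ran = 0;
        while (!mTasks.empty()) {
            std::function<void()> task = std::move(mTasks.front());
            mTasks.pop_front();
            task();
            ++ran;
        }
        return ran;
    }
    std::size_t pending() const { return mTasks.size(); }

private:
    std::deque<std::function<void()>> mTasks;
};

// Adapter: satisfies the executor interface by forwarding to the pool.
class PoolExecutorSketch final : public ExecutorSketch {
public:
    explicit PoolExecutorSketch(QueuePoolSketch& pool) : mPool(pool) {}
    void add(std::function<void()> task) override { mPool.post(std::move(task)); }

private:
    QueuePoolSketch& mPool;
};
```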

13.38.2 Integration with Skia¶

Skia uses the executor for parallelizing internal work:

// CacheManager.cpp, line 108
contextOptions->fExecutor = &sDefaultExecutor;

This allows Ganesh to move some of its internal CPU-side work onto the pool's worker threads instead of running everything serially on the RenderThread, which can reduce the wall-clock time for complex frames.

13.39 Deep Dive: Bitmap Handling¶

13.39.1 Hardware Bitmaps¶

Android supports "hardware bitmaps" whose pixels live in GPU-accessible graphics memory (an AHardwareBuffer) rather than in CPU heap memory:

graph TD
    A["Bitmap.createBitmap()"] --> B{"Hardware<br/>Bitmap?"}
    B -->|Yes| C["AHardwareBuffer<br/>allocation"]
    C --> D["GPU texture<br/>(via Gralloc)"]
    B -->|No| E["Native heap<br/>allocation"]
    E --> F["CPU memory"]

    G["Draw bitmap"] --> H{"Source?"}
    H -->|Hardware| I["Direct texture<br/>binding (fast)"]
    H -->|CPU| J["Upload to GPU<br/>(slow first time)"]

    style C fill:#4CAF50,color:#fff
    style E fill:#FF9800,color:#fff

13.39.2 Bitmap Upload Optimization¶

SkiaGpuPipeline::prepareToDraw() pre-uploads a bitmap to GPU memory before the frame rendering phase:

// SkiaGpuPipeline.cpp, line 137
void SkiaGpuPipeline::prepareToDraw(
        const RenderThread& thread, Bitmap* bitmap) {
    GrDirectContext* context = thread.getGrContext();
    if (context && !bitmap->isHardware()) {
        ATRACE_FORMAT("Bitmap#prepareToDraw %dx%d",
            bitmap->width(), bitmap->height());
        auto image = bitmap->makeImage();
        if (image.get()) {
            skgpu::ganesh::PinAsTexture(context, image.get());
            skgpu::ganesh::UnpinTexture(context, image.get());
            context->flushAndSubmit();
        }
    }
}

Pinning forces the upload to happen immediately, and the matching unpin drops the pin reference right away. The texture nevertheless remains in the GPU resource cache, so a later draw of the same bitmap can reuse it without another upload.
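The pin-then-unpin idiom can be modeled with a tiny cache in which an upload happens on the first pin and residency outlives the pin count. `TextureCacheSketch` is a hypothetical model for illustration and does not reflect how Skia's resource cache is implemented.

```cpp
#include <cassert>
#include <map>

// Hypothetical texture cache keyed by image id.
class TextureCacheSketch {
public:
    // Uploads on the first pin and takes a reference.
    // Returns true only if this call triggered an upload.
    bool pin(int imageId) {
        Entry& entry = mEntries[imageId];
        const bool uploaded = !entry.resident;
        entry.resident = true;
        ++entry.pins;
        if (uploaded) ++mUploads;
        return uploaded;
    }

    // Drops one reference; the texture stays resident until purged.
    void unpin(int imageId) {
        auto it = mEntries.find(imageId);
        assert(it != mEntries.end() && it->second.pins > 0);
        --it->second.pins;
    }

    // Evicts every texture that nobody has pinned.
    void purgeUnpinned() {
        for (auto it = mEntries.begin(); it != mEntries.end();) {
            if (it->second.pins == 0) {
                it = mEntries.erase(it);
            } else {
                ++it;
            }
        }
    }

    bool isResident(int imageId) const { return mEntries.count(imageId) != 0; }
    int uploads() const { return mUploads; }

private:
    struct Entry {
        bool resident = false;
        int pins = 0;
    };
    std::map<int, Entry> mEntries;
    int mUploads = 0;
};
```

Calling `pin()` followed immediately by `unpin()` warms the cache, so the next `pin()` for the same image reports that no upload was needed.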

13.39.3 HardwareBitmapUploader¶

The HardwareBitmapUploader class converts software bitmaps to hardware bitmaps. It can use either the GL or the Vulkan context; the diagram below shows the Vulkan path:

graph TD
    A["Software Bitmap"] --> B["HardwareBitmapUploader"]
    B --> C["Allocate AHardwareBuffer"]
    C --> D["Create VkImage from AHB"]
    D --> E["Copy pixel data to VkImage"]
    E --> F["Hardware Bitmap Ready"]

    style B fill:#2196F3,color:#fff
    style F fill:#4CAF50,color:#fff

13.40 Appendix: Key File Reference¶

13.40.1 OpenGL ES Stack¶

File	Path	Lines	Purpose
`eglApi.cpp`	`frameworks/native/opengl/libs/EGL/`	660	EGL API entry points
`egl.cpp`	`frameworks/native/opengl/libs/EGL/`	224	Driver initialization
`egl_platform_entries.cpp`	`frameworks/native/opengl/libs/EGL/`	~2,000	Platform EGL implementation
`Loader.cpp`	`frameworks/native/opengl/libs/EGL/`	~765	Driver loading
`MultifileBlobCache.cpp`	`frameworks/native/opengl/libs/EGL/`	~1,097	Shader cache
`egl_display.cpp`	`frameworks/native/opengl/libs/EGL/`	~600	Display management
`egl_object.cpp`	`frameworks/native/opengl/libs/EGL/`	~200	Object reference counting
`gl2.cpp`	`frameworks/native/opengl/libs/GLES2/`	~50	GLES2 trampoline

13.40.2 Vulkan Stack¶

File	Path	Lines	Purpose
`api.cpp`	`frameworks/native/vulkan/libvulkan/`	~1,484	API layer / layer management
`driver.cpp`	`frameworks/native/vulkan/libvulkan/`	~1,953	Driver loading / HAL interface
`swapchain.cpp`	`frameworks/native/vulkan/libvulkan/`	~2,000	Swapchain ↔ ANativeWindow
`layers_extensions.cpp`	`frameworks/native/vulkan/libvulkan/`	~500	Layer/extension discovery
`api_gen.cpp`	`frameworks/native/vulkan/libvulkan/`	~1,000	Generated dispatch
`driver_gen.cpp`	`frameworks/native/vulkan/libvulkan/`	~800	Generated driver dispatch
`null_driver.cpp`	`frameworks/native/vulkan/nulldrv/`	~500	Null driver for testing
`vkprofiles.cpp`	`frameworks/native/vulkan/vkprofiles/`	~200	Android baseline profiles

13.40.3 HWUI Stack¶

File	Path	Lines	Purpose
`RenderNode.h`	`frameworks/base/libs/hwui/`	452	View mirror in native
`RenderProperties.h`	`frameworks/base/libs/hwui/`	627	Visual property storage
`Canvas.h`	`frameworks/base/libs/hwui/hwui/`	298	Abstract drawing API
`SkiaCanvas.h`	`frameworks/base/libs/hwui/`	241	Skia Canvas implementation
`DisplayList.h`	`frameworks/base/libs/hwui/`	342	Command stream container
`CanvasOpTypes.h`	`frameworks/base/libs/hwui/canvas/`	75	Operation type enum
`RenderThread.cpp`	`frameworks/base/libs/hwui/renderthread/`	486	Singleton render thread
`DrawFrameTask.cpp`	`frameworks/base/libs/hwui/renderthread/`	227	Frame sync + draw task
`CanvasContext.cpp`	`frameworks/base/libs/hwui/renderthread/`	~1,000	Window rendering coordinator
`EglManager.cpp`	`frameworks/base/libs/hwui/renderthread/`	789	EGL context management
`VulkanManager.cpp`	`frameworks/base/libs/hwui/renderthread/`	~1,200	Vulkan context management
`VulkanSurface.cpp`	`frameworks/base/libs/hwui/renderthread/`	~500	Vulkan window surface
`CacheManager.cpp`	`frameworks/base/libs/hwui/renderthread/`	364	GPU memory management
`SkiaOpenGLPipeline.cpp`	`frameworks/base/libs/hwui/pipeline/skia/`	306	GL rendering pipeline
`SkiaVulkanPipeline.cpp`	`frameworks/base/libs/hwui/pipeline/skia/`	227	Vulkan rendering pipeline
`SkiaGpuPipeline.cpp`	`frameworks/base/libs/hwui/pipeline/skia/`	195	Common GPU pipeline
`RenderNodeDrawable.cpp`	`frameworks/base/libs/hwui/pipeline/skia/`	~400	Node drawing logic
`RenderProxy.cpp`	`frameworks/base/libs/hwui/renderthread/`	~300	UI thread proxy

13.40.4 System Properties Reference¶

Property	Default	Description
`debug.hwui.renderer`	(varies)	Force pipeline: `skiagl` or `skiavk`
`debug.hwui.profile`	`false`	Enable frame timing profiling
`debug.hwui.overdraw`	`false`	Show overdraw visualization
`debug.hwui.capture_skp_enabled`	`false`	Enable SKP capture
`debug.egl.callstack`	`false`	Log call stacks on EGL errors
`debug.vulkan.layers`	(empty)	Colon-separated Vulkan layers
`debug.gles.layers`	(empty)	Colon-separated GLES layers
`ro.hardware.vulkan`	(vendor)	Vulkan driver name
`ro.hardware.egl`	(vendor)	EGL driver name
`ro.vulkan.apex`	(empty)	Vulkan APEX module name
`debug.hwui.use_buffer_age`	`true`	Enable buffer age optimization
`debug.hwui.trace_gpu_resources`	`false`	Trace GPU memory
`debug.hwui.show_dirty_regions`	`false`	Flash dirty regions
`persist.sys.gpu.context_priority`	`0`	EGL context priority
`debug.hwui.disable_vsync`	`false`	Disable VSYNC synchronization
`debug.hwui.wait_for_gpu_completion`	`false`	Force GPU fence before swap

13.40.5 Mermaid: Complete Data Flow¶

This diagram summarizes the complete data flow from a View property change to a pixel on the display:

graph TD
    subgraph "Java Layer"
        A1["View.setAlpha(0.5f)"]
        A2["View.invalidate()"]
        A3["ViewRootImpl.scheduleTraversals()"]
        A4["Choreographer VSYNC callback"]
        A5["ViewRootImpl.performDraw()"]
        A6["View.updateDisplayListIfDirty()"]
        A7["RecordingCanvas.drawRect()"]
    end

    subgraph "HWUI Native (UI Thread)"
        B1["RenderNode.mutateStagingProperties()"]
        B2["Canvas.create_recording_canvas()"]
        B3["SkPictureRecorder.beginRecording()"]
        B4["SkCanvas draw operations"]
        B5["RenderNode.setStagingDisplayList()"]
    end

    subgraph "HWUI Native (RenderThread)"
        C1["DrawFrameTask.run()"]
        C2["syncFrameState()"]
        C3["RenderNode.prepareTree()"]
        C4["pushStagingPropertiesChanges()"]
        C5["pushStagingDisplayListChanges()"]
        C6["CanvasContext.draw()"]
        C7["SkiaPipeline.renderFrame()"]
        C8["RenderNodeDrawable.draw()"]
        C9["SkPicture.playback()"]
    end

    subgraph "GPU Layer"
        D1["Skia Ganesh"]
        D2["GrOpsTask batching"]
        D3["GPU command buffer"]
        D4["Shader compilation"]
        D5["GPU execution"]
    end

    subgraph "Composition Layer"
        E1["BufferQueue.queueBuffer()"]
        E2["SurfaceFlinger.onMessageInvalidate()"]
        E3["HWC.validate()"]
        E4["RenderEngine (if CLIENT)"]
        E5["HWC.present()"]
        E6["Display Controller"]
        E7["Physical Display"]
    end

    A1 --> B1
    A2 --> A3
    A3 --> A4
    A4 --> A5
    A5 --> A6
    A6 --> B2
    B2 --> B3
    B3 --> B4
    A7 --> B4
    B4 --> B5

    B5 --> C1
    C1 --> C2
    C2 --> C3
    C3 --> C4
    C3 --> C5
    C2 --> C6
    C6 --> C7
    C7 --> C8
    C8 --> C9

    C9 --> D1
    D1 --> D2
    D2 --> D3
    D3 --> D4
    D4 --> D5

    D5 --> E1
    E1 --> E2
    E2 --> E3
    E3 --> E4
    E4 --> E5
    E3 --> E5
    E5 --> E6
    E6 --> E7

    style A1 fill:#4CAF50,color:#fff
    style C1 fill:#2196F3,color:#fff
    style D1 fill:#FF9800,color:#fff
    style E2 fill:#9C27B0,color:#fff
    style E7 fill:#F44336,color:#fff

13.41 Glossary¶

Term	Definition
AHardwareBuffer	Cross-process GPU buffer handle
ANGLE	Almost Native Graphics Layer Engine (GL-on-Vulkan)
BufferQueue	Producer-consumer buffer management between app and SurfaceFlinger
CTS	Compatibility Test Suite
DamageRegion	Screen area that needs redrawing
DisplayList	Recorded canvas operation stream
EGL	Native platform interface for GPU contexts
FBO	Framebuffer Object (GL offscreen render target)
Ganesh	Skia's current production GPU backend
GLES	OpenGL for Embedded Systems
Graphite	Skia's next-generation GPU backend
Gralloc	Graphics memory allocator HAL
GrContext	Skia's GPU context object
HAL	Hardware Abstraction Layer
HWC	Hardware Composer
HWUI	Hardware UI (Android's native rendering library)
ICD	Installable Client Driver (Vulkan driver)
Jank	Visible frame drop or stutter
Layer	Offscreen render target for compositing
ProcHook	Vulkan loader function interception point
RenderEngine	SurfaceFlinger's GPU composition engine
RenderNode	Native counterpart of a Java View
RenderThread	Dedicated thread for GPU rendering in each app
SKP	Skia Picture (serialized draw command recording)
SkSL	Skia's Shading Language
SPIR-V	Standard Portable Intermediate Representation (binary shader format consumed by Vulkan)
SurfaceFlinger	System compositor
TLS	Thread-Local Storage
VSYNC	Vertical Synchronization signal from display
VulkanSurface	HWUI's Vulkan window surface wrapper

Summary¶

This chapter has traced Android's graphics pipeline from application code to display hardware, examining every layer in detail:

Layer	Key Files	Lines of Code
EGL/GLES Loader	`eglApi.cpp`, `egl.cpp`, `Loader.cpp`	~1,650
MultifileBlobCache	`MultifileBlobCache.cpp/.h`	~1,300
Vulkan Loader	`api.cpp`, `driver.cpp`, `swapchain.cpp`	~5,400
HWUI Core	`RenderNode.h`, `RenderProperties.h`, `Canvas.h`	~1,400
HWUI Display List	`DisplayList.h`, `CanvasOpTypes.h`	~400
RenderThread	`RenderThread.cpp`, `DrawFrameTask.cpp`	~710
EglManager	`EglManager.cpp`	~789
VulkanManager	`VulkanManager.cpp`	~1,200
CacheManager	`CacheManager.cpp`	~364
SkiaGL Pipeline	`SkiaOpenGLPipeline.cpp`	~306
SkiaVulkan Pipeline	`SkiaVulkanPipeline.cpp`	~227
Skia (external)	`src/gpu/ganesh/`, `include/core/`	~500,000+

The architecture reflects more than a decade of evolution:

Android 1.0-2.x: Software rendering only
Android 3.0: Hardware-accelerated rendering introduced (HWUI v1)
Android 4.0: GPU rendering default for all apps
Android 5.0: RenderThread added for async GPU work
Android 7.0: Vulkan 1.0 support
Android 9.0: Skia-based pipeline (replacing legacy OpenGL display list renderer)
Android 10.0: ANGLE integration for GL-on-Vulkan
Android 12.0: Vulkan as default render pipeline on supported devices
Android 13.0+: Skia Graphite backend development begins

The key design principle throughout is separation of concerns with minimal cross-thread synchronization. The UI thread records, the RenderThread renders, SurfaceFlinger composes, and HWC presents -- each with well-defined handoff points and fence-based synchronization rather than locks.