m4gnum's blog

Just some random thoughts about reversing, exploitation, OS internals, hypervisors, compilers and platform security.

13 April 2022

Why would legacy BIOS support multithreading?

by m4gnum

Last week, I dove into the implementation of the floppy driver in SeaBIOS (an open-source legacy BIOS implementation used by QEMU and KVM). I started by reading the sector reading routine. Turns out that SeaBIOS uses a common floppy_dma_cmd routine for reading, writing and formatting alike. This routine does the following:

  1. Sets up the DMA controller (mainly, writes the buffer physical address to PORT_DMA_ADDR_2 and the buffer size minus one to PORT_DMA_CNT_2).
  2. Invokes the floppy controller via port I/O (starts the motor, waits for it to get up to speed, writes the command into PORT_FD_DATA, waits for a success IRQ to arrive, and reads the return values from PORT_FD_DATA).
  3. Populates the floppy_return_status array of the BDA (BIOS data area) with the return values read from PORT_FD_DATA.
  4. Checks these return values for errors and maps them to the relevant return status.
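To make step 1 a bit more concrete, here is a minimal sketch of how the values for DMA channel 2 could be derived from a buffer. This is not SeaBIOS code: the struct and function names are hypothetical, and the limits it checks (a 16-bit address and count, a separate page register, no crossing of a 64 KiB boundary) are assumptions based on how the classic ISA DMA controller behaves.

```c
#include <stdint.h>

/* Hypothetical container for the values a driver would program into
 * the ISA DMA controller for channel 2 (names are illustrative only). */
struct dma2_params {
    uint16_t addr;   /* low 16 bits of the buffer's physical address */
    uint8_t  page;   /* bits 16..23, for the channel's page register */
    uint16_t count;  /* transfer length minus one */
};

/* Returns 0 on success, -1 if the buffer cannot be used for the transfer. */
int dma2_compute(uint32_t phys, uint32_t len, struct dma2_params *out)
{
    if (len == 0 || len > 0x10000)
        return -1;                    /* the count register is 16 bits wide */
    uint32_t last = phys + len - 1;
    if (last < phys || last > 0xFFFFFF)
        return -1;                    /* only the low 16 MiB is reachable */
    if ((phys >> 16) != (last >> 16))
        return -1;                    /* must not cross a 64 KiB boundary */
    out->addr  = (uint16_t)(phys & 0xFFFF);
    out->page  = (uint8_t)(phys >> 16);
    out->count = (uint16_t)(len - 1);
    return 0;
}
```

For a 512-byte sector buffer at physical address 0x12345, this yields an address of 0x2345, a page of 0x01 and a count of 511. A buffer at 0xFFF0 with the same length is rejected, because it would straddle a 64 KiB boundary.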

One thing that made me wonder is the IRQ waiting part, which makes this asynchronous, DMA-based implementation fit the synchronous nature of boot sectors:

// Wait for command to complete.
if (command & FCF_WAITIRQ) {
    int ret = floppy_wait_irq();
    if (ret)
        return ret;
}

Initially, I thought this would be a very naive implementation - just a busy wait on a flag that gets set in the interrupt handler of INT #0x0E (the #PF exception vector, but also where the floppy controller’s IRQ 6 lands in the default mapping of the 8259 PIC)… Turns out that this is the implementation:

static int
floppy_wait_irq(void)
{
    u8 frs = GET_BDA(floppy_recalibration_status);
    SET_BDA(floppy_recalibration_status, frs & ~FRS_IRQ);
    u32 end = timer_calc(FLOPPY_IRQ_TIMEOUT);
    for (;;) {
        if (timer_check(end)) {
            warn_timeout();
            floppy_disable_controller();
            return DISK_RET_ETIMEOUT;
        }
        frs = GET_BDA(floppy_recalibration_status);
        if (frs & FRS_IRQ)
            break;
        // Could use yield_toirq() here, but that causes issues on
        // bochs, so use yield() instead.
        yield();
    }
    SET_BDA(floppy_recalibration_status, frs & ~FRS_IRQ);
    return DISK_RET_SUCCESS;
}
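Stripped of the SeaBIOS specifics, the loop above has a simple shape: clear the flag, then poll it against a deadline, giving up the CPU between polls. Here is a minimal sketch of that shape in a simulated environment. Everything in it is hypothetical: the tick counter stands in for the hardware timer, the flag for the bit an IRQ handler would set, and sim_yield for whatever lets time pass and interrupts arrive.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated environment: a tick counter standing in for the hardware
 * timer, and a flag standing in for the bit an IRQ handler would set. */
struct sim {
    uint32_t now;        /* current tick */
    uint32_t irq_at;     /* tick at which the simulated IRQ fires */
    volatile bool flag;  /* set by the simulated handler */
};

/* Stand-in for yielding: let one tick pass and deliver the IRQ when due. */
static void sim_yield(struct sim *s)
{
    s->now++;
    if (s->now >= s->irq_at)
        s->flag = true;
}

/* Returns 0 if the flag was seen before the deadline, -1 on timeout. */
int wait_flag(struct sim *s, uint32_t timeout_ticks)
{
    uint32_t end = s->now + timeout_ticks;  /* counter wrap-around is ignored */
    s->flag = false;                        /* discard any stale completion */
    for (;;) {
        if (s->now >= end)
            return -1;                      /* deadline passed: give up */
        if (s->flag)
            break;
        sim_yield(s);                       /* let other work and IRQs run */
    }
    s->flag = false;                        /* consume the completion */
    return 0;
}
```

With the IRQ scheduled for tick 3 and a timeout of 10 ticks, wait_flag returns 0 once the counter reaches 3. With the IRQ scheduled for tick 100 and a timeout of 5 ticks, it returns -1 at tick 5.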

Well, this seems pretty similar to my assumption - the flag is the FRS_IRQ bit of the floppy_recalibration_status of the BDA, and support for timeouts is also included. But, what is that call to yield()? Let’s look at it:

// Briefly permit irqs to occur.
void
yield(void)
{
    if (MODESEGMENT || !CONFIG_THREADS) {
        check_irqs();
        return;
    }
    struct thread_info *cur = getCurThread();
    // Switch to the next thread
    switch_next(cur);
    if (cur == &MainThread)
        // Permit irqs to fire
        check_irqs();
}

What the heck? So you’re telling me a legacy BIOS supports multithreading? Why would it ever be needed?

To answer this question, I decided to try to understand which threads are created besides the main thread. It seems that the function responsible for starting a thread is called run_thread, and is located inside stacks.c, which contains all multithreading utilities. Let’s list all the calls to run_thread:

src/hw/ahci.c:        run_thread(ahci_port_detect, port);
src/hw/ata.c:    run_thread(ata_detect, chan_gf);
src/hw/esp-scsi.c:        run_thread(init_esp_scsi, pci);
src/hw/lsi-scsi.c:        run_thread(init_lsi_scsi, pci);
src/hw/megasas.c:            run_thread(init_megasas, pci);
src/hw/mpt-scsi.c:            run_thread(init_mpt_scsi, pci);
src/hw/nvme.c:        run_thread(nvme_controller_setup, pci);
src/hw/ps2port.c:    run_thread(ps2_keyboard_setup, NULL);
src/hw/pvscsi.c:        run_thread(init_pvscsi, pci);
src/hw/sdcard.c:        run_thread(sdcard_romfile_setup, file);
src/hw/sdcard.c:        run_thread(sdcard_pci_setup, pci);
src/hw/usb.c:        run_thread(usb_hub_port_setup, usbdev);
src/hw/usb-ehci.c:    run_thread(configure_ehci, cntl);
src/hw/usb-ohci.c:    run_thread(configure_ohci, cntl);
src/hw/usb-uhci.c:    run_thread(configure_uhci, cntl);
src/hw/usb-xhci.c:    run_thread(configure_xhci, xhci);
src/hw/usb-xhci.c:    run_thread(configure_xhci, xhci);
src/hw/virtio-blk.c:        run_thread(init_virtio_blk, pci);
src/hw/virtio-mmio.c:        run_thread(init_virtio_blk_mmio, mmio);
src/hw/virtio-mmio.c:        run_thread(init_virtio_scsi_mmio, mmio);
src/hw/virtio-scsi.c:        run_thread(init_virtio_scsi, pci);

Interestingly, these all seem to be routines related to device setup. Looking at the calls to them (and to their callers, recursively), all roads seem to lead to the function device_hardware_setup of post.c (the source file containing all power-on self-test (POST) phase logic):

void
device_hardware_setup(void)
{
    usb_setup();
    ps2port_setup();
    block_setup();
    lpt_setup();
    serial_setup();
    cbfs_payload_setup();
}

This is just an “orchestrator” function that calls all device setup routines. Let’s look at the function maininit (the main BIOS setup routine) which calls it, and might help us understand how and why multithreading is used during device setup:

static void
maininit(void)
{
    // Initialize internal interfaces.
    interface_init();

    // Setup platform devices.
    platform_hardware_setup();

    // Start hardware initialization (if threads allowed during optionroms)
    if (threads_during_optionroms())
        device_hardware_setup();

    // Run vga option rom
    vgarom_setup();
    sercon_setup();
    enable_vga_console();

    // Do hardware initialization (if running synchronously)
    if (!threads_during_optionroms()) {
        device_hardware_setup();
        wait_threads();
    }

    // Run option roms
    optionrom_setup();

    // Allow user to modify overall boot order.
    interactive_bootmenu();
    wait_threads();

    // Prepare for boot.
    prepareboot();

    // Write protect bios memory.
    make_bios_readonly();

    // Invoke int 19 to start boot process.
    startBoot();
}

The comments here are very interesting. There is a check for whether threads are allowed to run during what’s called option ROM setup. If they are, device initialization is started very early, and wait_threads is called only after the option ROM setup and the boot menu. Otherwise, device initialization is started after the VGA option ROM setup and before the setup of the other option ROMs, and wait_threads is called right after device_hardware_setup. Thus, we can understand that one use of multithreading in legacy BIOS is allowing device setup to run in parallel with OROM setup. As a quick note, an option ROM (OROM) is firmware, usually stored on an expansion card, which gets executed by the main BIOS and initializes non-PnP devices, while also optionally adding support for them to the BIOS. This is critical for supporting non-PnP devices in the same way that standard peripherals are supported.
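The two orderings can be laid out side by side with a tiny model of maininit. This is only a sketch of the control flow quoted above: the function name is hypothetical, and the step names are descriptive labels rather than SeaBIOS identifiers.

```c
#include <stddef.h>

/* Fills `out` with the order of the POST steps discussed above and
 * returns how many were written. `out` needs room for 6 entries. */
size_t post_order(int threads_during_optionroms, const char **out)
{
    size_t n = 0;
    if (threads_during_optionroms)
        out[n++] = "start device setup threads";
    out[n++] = "run VGA option ROM";
    if (!threads_during_optionroms) {
        out[n++] = "start device setup threads";
        out[n++] = "wait for threads";
    }
    out[n++] = "run other option ROMs";
    out[n++] = "boot menu";
    out[n++] = "wait for threads";
    return n;
}
```

When threads are allowed during option ROMs, the model produces five steps and the only wait comes at the very end, so device setup overlaps with everything in between. Otherwise it produces six steps, with an extra wait placed before the remaining option ROMs run.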

I’d like to note that even without enabling multithreading during OROM setup, threads are still useful since device initialization is done in parallel anyway. For example, let’s take a look at nvme_controller_setup, which is another routine that is called via run_thread. Internally, it calls nvme_controller_enable, which does the following after enabling the NVMe controller:

if (nvme_wait_csts_rdy(ctrl, 1)) {
    dprintf(2, "NVMe fatal error while enabling controller\n");
    goto err_destroy_admin_sq;
}

Let’s look at how nvme_wait_csts_rdy works:

static int
nvme_wait_csts_rdy(struct nvme_ctrl *ctrl, unsigned rdy)
{
    u32 const max_to = 500 /* ms */ * ((ctrl->reg->cap >> 24) & 0xFFU);
    u32 to = timer_calc(max_to);
    u32 csts;

    while (rdy != ((csts = ctrl->reg->csts) & NVME_CSTS_RDY)) {
        yield();

        if (csts & NVME_CSTS_FATAL) {
            dprintf(3, "NVMe fatal error during controller shutdown\n");
            return -1;
        }

        if (timer_check(to)) {
            warn_timeout();
            return -1;
        }
    }

    return 0;
}

Cool, we’ve found another call to yield! This tells us that during NVMe initialization, while waiting for the NVMe controller to report that it has been successfully enabled (or disabled), we can work on the initialization of another device. I’m not sure how significant this optimization is, but it is definitely really cool, probably reduces boot time, and is much more advanced than what I originally thought about legacy BIOS. Let’s see how frequently it is used with a basic grep command:

$ grep "yield();" src/hw/* | cut -d ":" -f 1 | uniq -c
      5 src/hw/ahci.c
      3 src/hw/ata.c
      3 src/hw/floppy.c
      2 src/hw/megasas.c
      2 src/hw/nvme.c
      3 src/hw/ps2port.c
      1 src/hw/rtc.c
      1 src/hw/sdcard.c
      1 src/hw/timer.c
      2 src/hw/tpm_drivers.c
      1 src/hw/usb.c
      5 src/hw/usb-ehci.c
      3 src/hw/usb-ohci.c
      3 src/hw/usb-uhci.c
      3 src/hw/usb-xhci.c

Although this check is very rough (it also matches calls to yield() not during device setup), it seems like this optimization is used pretty widely in SeaBIOS’ codebase.
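To get a feeling for why yielding inside these wait loops can shorten boot, here is a toy model of several devices that each need some time to become ready. It is a sketch under loud assumptions, not SeaBIOS code: time is counted in abstract ticks, polling a device is treated as free, one tick passes per scheduler round, and both function names are hypothetical.

```c
#include <stdint.h>

/* Busy-wait for each device in turn, so the delays add up.
 * delay[i] is how many ticks device i needs after its setup command. */
uint32_t setup_sequential(const uint32_t *delay, int n)
{
    uint32_t now = 0;
    for (int i = 0; i < n; i++) {
        uint32_t ready_at = now + delay[i];  /* command issued at `now` */
        while (now < ready_at)
            now++;                           /* nothing else can run */
    }
    return now;
}

/* Round-robin between one "thread" per device. Every thread issued its
 * command at tick 0; on each turn it polls once and then yields.
 * Supports up to 32 devices (tracked in a bitmask); returns 0 otherwise. */
uint32_t setup_cooperative(const uint32_t *delay, int n)
{
    if (n < 0 || n > 32)
        return 0;
    uint32_t now = 0, done = 0;
    int remaining = n;
    while (remaining > 0) {
        for (int i = 0; i < n; i++) {
            if (done & (UINT32_C(1) << i))
                continue;                    /* thread already exited */
            if (now >= delay[i]) {           /* device reports ready */
                done |= UINT32_C(1) << i;
                remaining--;
            }
            /* otherwise: yield, and the next thread gets to poll */
        }
        if (remaining > 0)
            now++;                           /* one tick per scheduler round */
    }
    return now;
}
```

For three devices with delays of 5, 3 and 4 ticks, the sequential version takes 12 ticks (the sum), while the cooperative version takes 5 ticks (the longest single delay). Real hardware is messier, but this is the effect the yield() calls are after.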

The only question I still have open is whether this is cooperative or preemptive multithreading. Looking a bit at stacks.c again, I found a group of functions with the _preempt suffix, among them start_preempt, check_preempt and wait_preempt.

So, isn’t the answer obvious? start_preempt turns on RTC (real-time clock) IRQs, in which check_preempt is called. Looks like classic preemptive multithreading. However, looking again at yield(), I’ve noticed the following comment:

// Briefly permit irqs to occur.
void
yield(void)
{
    // [...]
}

It is noted that yield briefly permits IRQs to occur, in addition to switching to the next thread. This is done using the following inline assembly snippet, which is part of check_irqs:

asm volatile("sti ; nop ; rep ; nop ; cli ; cld" : : :"memory");

Thus, it doesn’t really matter that start_preempt enables RTC IRQs, as these will mostly go unhandled, since interrupts are disabled almost all of the time (take a look at the cli instruction above). However, it turns out that wait_preempt is actually still used. Take a look at the following function, which enables DMA support for a PCI device, and is called during almost every PCI device initialization phase:

void
pci_enable_busmaster(struct pci_device *pci)
{
    wait_preempt();
    pci_config_maskw(pci->bdf, PCI_COMMAND, 0, PCI_COMMAND_MASTER);
    pci->have_driver = 1;
}

Look! It calls wait_preempt() before doing anything! Turns out that pci_enable_iobar and pci_enable_membar behave the same way, and this is probably the main trigger for thread switching in this cooperative multithreading scheme.

So, in conclusion, we’ve learned a very surprising fact - SeaBIOS uses cooperative multithreading during early device setup to reduce boot times. This is useful due to the asynchronous nature of the initialization of some peripherals. The funny thing is that while writing this blog post, I searched the codebase for yield() to copy code snippets, and ran into the following block of documentation:

Threads
=======

Internally SeaBIOS implements a simple cooperative multi-tasking
system. The system works by giving each "thread" its own stack, and
the system round-robins between these stacks whenever a thread issues
a yield() call. This "threading" system may be more appropriately
described as [coroutines](http://en.wikipedia.org/wiki/Coroutine).
These "threads" do not run on multiple CPUs and are not preempted, so
atomic memory accesses and complex locking is not required.

The goal of these threads is to reduce overall boot time by
parallelizing hardware delays. (For example, by allowing the wait for
an ATA hard drive to spin-up and respond to commands to occur in
parallel with the wait for a PS/2 keyboard to respond to a setup
command.) These hardware setup threads are only available during the
"setup" sub-phase of the [POST phase](#POST_phase).

The code that implements threads is in stacks.c.

I probably should have checked whether documentation existed before diving into the implementation of multithreading in SeaBIOS… Well, at least it was fun :)

tags: bios - firmware