|=----------------------------------------------------------------------------=|
|=-----------------------------=[ Paper Review ]=-----------------------------=|
|=---=[ USBFuzz: A Framework for Fuzzing USB Drivers by Device Emulation ]=---=|
|=----------------------------------------------------------------------------=|
|=-----------------------=[ Mon 26 Oct 2020 21:07:17 ]=-----------------------=|
|=-------------------=[ https://qiuhao.org/PR_USBFuzz.txt ]=------------------=|
|=----------------------------------------------------------------------------=|

--[ 1 - Introduction

When developing device drivers, people usually assume that the device is
trusted. For removable, hot-pluggable peripherals such as USB devices, that
assumption does not hold. This gives a threat model in which an adversary
attacks a computer through the USB interface to exploit vulnerabilities in
the IO stack.

Fuzzing is a practical way to find those vulnerabilities before they are
exploited. However, fuzzing drivers from the lower-level interfaces is
difficult: programmable hardware devices are expensive and hard to scale to
many USB targets, and attaching and detaching a device is slow on a physical
platform. Solutions that adapt the kernel, like syzkaller and PeriScope, need
patches to a particular OS, so they require a deep understanding of the
kernel and lack portability. They also fuzz at a single layer of the IO
stack, so some code paths cannot be tested. Finally, solutions like vUSBf do
not use execution information such as coverage to guide the fuzzing process,
so their efficiency and bug-finding ability suffer.

This paper presents USBFuzz, a portable, flexible, and modular fuzzing
framework consisting of a fuzzer, an emulated device, and a target OS running
on a virtualization platform. It can be customized to fuzz the USB subsystem
and other peripherals in different kernels, and it can use coverage feedback
on a patched Linux kernel.

--[ 2 - Design & Implementation

To meet three goals (low cost, portability, and minimal required knowledge),
USBFuzz fuzzes the driver through an emulated device whose data comes from
an external fuzzer. To carry synchronization and status information between
fuzzing iterations, the authors also add a PCI device to the virtualization
platform. Overall, the architecture of USBFuzz looks like this:

                                            QEMU
                            +-----------------------------------+
                            |                                   |
                            |   User Mode Agent          +      |
                            +-----------------------------------+
                            |                            |      |
                            |    Target Driver        +  |  +   |
                            +------^----------------------------+
                            |      |                  |  |  |   |
                            |   +--+--------+         |  |  |   |
                            |   | USB Device|         |  |  |   |
                            |   +--+--------+         |  |  |   |
                            |      ^                  |  |  |   |
                            |      |                  |  |  |   |
+-----------+ Bitmap/Corpus |   +--+--------+ Feedback|  |  |   |
|           +<----------------->+           +<--------+  |  |   |
|           |  Status Pipe  |   |  IVSHMEM  |            |  |   |
|   AFL     +<------------------+           +<-----------+  |   |
|  Fuzzer   |  Control Pipe |   |           |  Status       |   |
|           +------------------>+           |               |   |
|           |               |   +-----------+         ^ MMIO|   |
|           |               |                     IRQ |  PIO|   |
+-----------+               +-----------------------------------+
                                                      |     |
+-----------------------------------------------------+-----v---+
|                Host Kernel                      KVM           |
+---------------------------------------------------------------+

The fuzzer places its coverage bitmap and corpus output in a shared memory
region. The PCI device (IVSHMEM) maps this region directly into the VM's
address space, so there is no memory copy cost. After the fuzzer and QEMU
finish initializing, the user mode agent in the VM triggers fuzzing through
the status channel. The fuzzer in turn notifies the VM through the control
channel when it has finished generating a test input.
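
The handshake between the fuzzer and the VM side can be pictured with two
ordinary POSIX pipes. This is only a minimal sketch: the one-byte message
codes and all function names below are hypothetical, and the real wire format
used by USBFuzz is not described in this review.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical one-byte message codes, invented for this sketch. */
enum {
    MSG_AGENT_READY = 1, /* status: agent in the VM is ready to fuzz   */
    MSG_INPUT_READY = 2, /* control: fuzzer has written a new input    */
    MSG_TEST_OK     = 3, /* status: iteration finished without a crash */
    MSG_TEST_CRASH  = 4  /* status: the agent saw a kernel error       */
};

/* Write one message byte; returns 0 on success, -1 on failure. */
int send_msg(int fd, uint8_t msg)
{
    return write(fd, &msg, 1) == 1 ? 0 : -1;
}

/* Block until one message byte arrives; returns it, or -1 on failure. */
int recv_msg(int fd)
{
    uint8_t msg;
    return read(fd, &msg, 1) == 1 ? (int)msg : -1;
}

/* Fuzzer side of one iteration: announce the input on the control pipe,
 * then wait for the verdict on the status pipe. */
int fuzzer_run_one(int control_wr, int status_rd)
{
    if (send_msg(control_wr, MSG_INPUT_READY) != 0)
        return -1;
    return recv_msg(status_rd);
}

/* VM side of one iteration: wait for the input, run the test (elided),
 * and report the outcome on the status pipe. */
int vm_run_one(int control_rd, int status_wr, int crashed)
{
    if (recv_msg(control_rd) != MSG_INPUT_READY)
        return -1;
    /* ...attach the emulated device and let the driver consume input... */
    return send_msg(status_wr, crashed ? MSG_TEST_CRASH : MSG_TEST_OK);
}
```

The test input itself never travels over the pipes; it stays in the shared
memory region, so each message only needs to say that something happened.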

It should be noted that USBFuzz ignores almost all write operations from
drivers, so there is no arrow from the target driver to the USB device.

To implement this design, the authors extend several open-source projects.
QEMU: add the emulated USB device that feeds fuzzer input to the guest and
the IVSHMEM device with two callbacks for communication. AFL: add the ability
to export its shared memory and add the control and status communication
functions. The userspace agent is implemented from scratch.

Fuzzing mostly targets the attach process, when the USB driver framework
reads the device and configuration descriptors and picks the appropriate
driver to interact with the device. (With the userspace agent's help,
fuzzing can also focus on a single driver.) These descriptors are usually
read multiple times and reading them is relatively slow, so USBFuzz handles
those read operations separately and serves them from cached content.
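
One way to picture the split between descriptor reads and all other reads is
the sketch below. The input layout (device descriptor, then configuration
descriptor, then a byte stream) and every name in it are assumptions made for
illustration; only the 18-byte device descriptor size and the two descriptor
type codes come from the USB specification.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define USB_DT_DEVICE 0x01 /* standard descriptor type codes */
#define USB_DT_CONFIG 0x02
#define DEV_DESC_LEN  18   /* size of a standard USB device descriptor */

/* Hypothetical input layout: [device desc | config desc | stream]. */
struct fuzz_input {
    const uint8_t *data;
    size_t len;
    size_t cfg_len; /* bytes reserved for the configuration descriptor */
    size_t cursor;  /* next unread byte of the stream part */
};

void fuzz_input_init(struct fuzz_input *in, const uint8_t *data,
                     size_t len, size_t cfg_len)
{
    in->data = data;
    in->len = len;
    in->cfg_len = cfg_len;
    in->cursor = DEV_DESC_LEN + cfg_len;
    if (in->cursor > len)
        in->cursor = len;
}

/* Copy at most `want` bytes from data[off, end), with end clamped to len. */
static size_t copy_clamped(const struct fuzz_input *in, size_t off,
                           size_t end, uint8_t *buf, size_t want)
{
    if (end > in->len)
        end = in->len;
    if (off >= end)
        return 0;
    size_t n = end - off;
    if (n > want)
        n = want;
    memcpy(buf, in->data + off, n);
    return n;
}

/* Descriptor reads are idempotent: the same cached bytes every time. */
size_t read_descriptor(const struct fuzz_input *in, uint8_t type,
                       uint8_t *buf, size_t want)
{
    if (type == USB_DT_DEVICE)
        return copy_clamped(in, 0, DEV_DESC_LEN, buf, want);
    if (type == USB_DT_CONFIG)
        return copy_clamped(in, DEV_DESC_LEN, DEV_DESC_LEN + in->cfg_len,
                            buf, want);
    return 0;
}

/* Every other read consumes the stream and advances the cursor. */
size_t read_stream(struct fuzz_input *in, uint8_t *buf, size_t want)
{
    size_t n = copy_clamped(in, in->cursor, in->len, buf, want);
    in->cursor += n;
    return n;
}
```

Because descriptor reads never move the cursor, the kernel can request the
descriptors as often as it likes without eating into the bytes meant for the
driver under test.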

To improve fuzzing performance, USBFuzz fuzzes in memory (persistent mode):
each fuzzing iteration starts from a prepared kernel snapshot.

One interesting detail is that the userspace agent detects oopses in drivers
empirically: it scans the kernel log (/dev/kmsg on Linux)! For any warning or
error message, it sends a report to the fuzzer over the status channel. Of
course, the kernel has to be prepared beforehand by adjusting its parameters
and enabling sanitizers such as KASAN at build time.
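
A minimal sketch of such a log scan follows. The marker strings are typical
prefixes of Linux kernel error reports, picked here for illustration; the
exact patterns the USBFuzz agent matches are not given in this review, and
both function names are hypothetical. Records read from /dev/kmsg carry a
"priority,sequence,timestamp,flags;" prefix before the message text.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative markers only, not the agent's actual pattern list. */
static const char *const crash_markers[] = {
    "BUG:", "WARNING:", "KASAN:", "Oops",
    "general protection fault", "Kernel panic",
};

/* Skip the "prio,seq,timestamp,flags;" prefix of a /dev/kmsg record.
 * Returns the record unchanged if it contains no ';'. */
const char *kmsg_message(const char *record)
{
    const char *sep = strchr(record, ';');
    return sep != NULL ? sep + 1 : record;
}

/* True if the message text contains any of the crash markers. */
bool log_line_is_crash(const char *line)
{
    size_t count = sizeof(crash_markers) / sizeof(crash_markers[0]);
    for (size_t i = 0; i < count; i++) {
        if (strstr(line, crash_markers[i]) != NULL)
            return true;
    }
    return false;
}
```

An agent built this way would read /dev/kmsg record by record and send a
crash report over the status channel for the first record that matches.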

The Linux coverage collection tool kcov collects coverage as a function of
syscall inputs. It therefore does not cover soft/hard interrupts, and it
skips instrumentation of some inherently non-deterministic or uninteresting
parts of the kernel (e.g., scheduler, locking). To meet the fuzzing goals,
the authors extend kcov to compute the bitmap index from the current and the
previously executed location:

  index = (hash(IP) ^ hash(prev_loc)) % BITMAP_SIZE;
  bitmap[index]++;
  prev_loc = IP;

When an exception occurs, USBFuzz saves the previous location (prev_loc): in
the struct task in process context, or on the kernel stack in interrupt
context (which also handles nested interrupts). After the context is
restored, coverage tracking continues as if it had never been interrupted.
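
The edge tracking and the save/restore step can be sketched in plain C as
below. The hash function, the 64 KiB bitmap size, and all names are
assumptions for illustration (the review does not say which hash or size
USBFuzz uses), and resetting prev_loc to 0 on interrupt entry is a choice
made for this sketch, not something the paper is known to do.

```c
#include <stddef.h>
#include <stdint.h>

#define BITMAP_SIZE 65536u /* assumed size for this sketch */

static uint8_t bitmap[BITMAP_SIZE];

/* Hypothetical per-context state: one per task or interrupt frame. */
struct cov_ctx {
    uint64_t prev_loc;
};

/* Illustrative 64-bit mixer (the MurmurHash3 finalizer). */
uint64_t hash_loc(uint64_t x)
{
    x ^= x >> 33;
    x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33;
    x *= 0xc4ceb9fe1a85ec53ULL;
    x ^= x >> 33;
    return x;
}

/* Record the edge prev_loc -> ip; returns the bitmap slot that was bumped. */
size_t cov_trace(struct cov_ctx *ctx, uint64_t ip)
{
    size_t index = (size_t)((hash_loc(ip) ^ hash_loc(ctx->prev_loc))
                            % BITMAP_SIZE);
    bitmap[index]++;
    ctx->prev_loc = ip;
    return index;
}

/* Interrupt entry: hand the interrupted prev_loc to the caller, who keeps
 * it on its own stack, and start the handler with a clean prev_loc. */
uint64_t cov_enter_irq(struct cov_ctx *ctx)
{
    uint64_t saved = ctx->prev_loc;
    ctx->prev_loc = 0;
    return saved;
}

/* Interrupt exit: put the interrupted prev_loc back. */
void cov_exit_irq(struct cov_ctx *ctx, uint64_t saved)
{
    ctx->prev_loc = saved;
}
```

Without the save/restore pair, the first edge recorded after an interrupt
would pair the resumed code with the handler's last location, adding noise
that depends on interrupt timing rather than on the test input.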

--[ 3 - Evaluation

Hardware and software: a cluster of four nodes running Ubuntu 16.04 LTS with
KVM. Each node has 32GB of memory and an Intel i7-6700K processor.

With coverage-guided fuzzing, USBFuzz found 47 unique bugs in nine recent
versions of the Linux kernel over four weeks, reaching about 2.8 million
executions. Since it fuzzes through an emulated device, it is easy to port
to closed-source OSes like Windows and MacOS. There, using cross-pollinated
seeds, it finds several bugs leading to DoS.

Compared with usb-fuzzer, the USB extension of syzkaller, USBFuzz shows no
advantage in coverage (btw, both are under 5 percent) or in the number of
vulnerabilities found. The authors attribute syzkaller's results to manual
analysis of the kernel code and to tailoring the individual generated USB
messages to the different USB drivers and protocols. However, syzkaller
cannot cover the host controller driver, since it uses a USB gadget and a
software host controller (dummy_hcd.c). Conversely, USBFuzz cannot cover the
gadget subsystem of Linux.

The fuzzing rate is relatively low (0.1–2.6 exec/sec) for both USBFuzz and
syzkaller. The authors blame this on slow device recognition in Linux: it
takes about four seconds to fully recognize a USB flash drive on a physical
machine. Nevertheless, the authors do not propose a solution to this.

Finally, USBFuzz was also adapted to fuzz SD card drivers to show that it is
extensible; it found no bugs there, though.

--[ 4 - Authors' future work

As mentioned above, syzkaller is manually engineered to follow the standards,
so it can miss bugs that only non-standard-compliant input triggers. The two
engines could therefore be combined and share seeds with each other.

Attaching and detaching the device accounts for about 50% of the total
execution time of short tests (those that fail the initial checks). Caching
the emulated device and performing only the necessary initialization
operations may help eliminate this overhead.

--[ 5 - Critical thoughts

Offensive security vs. Defensive security: "The best alternative to defense
mechanisms is to find and fix the bugs." --- The authors. "ACTUAL effective
improvements to security come from building mitigations to kill entire
classes of vulns, not bug hunting." --- Nate Warfield (Defender of 1st
responders | Security Research @ MS)

Value of these vulnerabilities: the threat model is of limited value to
adversaries, since most servers disable their USB interfaces in UEFI or do
not load the corresponding drivers when booting the kernel. Furthermore, an
adversary needs either a physical connection to the victim's PC or a victim
who uses USB in remote mode (USBIP, usbredir). Both are rare and difficult to
exploit.

Fuzzing throughput: fuzzing is too slow, but the authors only break the time
down and attribute it to the long path a test input takes once it passes the
initial sanity checks and is bound to a driver. They do not propose any
solution.

The authors do not clarify whether they exploit the cluster for parallel
fuzzing, for example by sharing the corpus between nodes.

--[ 6 - What we learned

Fuzz from *all directions/dimensions*, and fuzz to death ;)

Computing power, especially the CPU/Cache and RAM, is vital for fuzzing.
(money talks!)

--[ 7 - Can we do better?

Should we also use the write operations from the drivers, especially the
control or notification messages?

Can we merge the control pipe and the status pipe into the IVSHMEM device,
since it supports interrupts between the host and the guest?

"on Windows and MacOS, due to the lack of a clear signal from the kernel when
devices are attached/detached, our user mode agent uses a fixed timeout (1
second on MacOS and 5 seconds on Windows) to let the device properly
initialize." --- Could we move the agent into QEMU, so that it knows when
the driver notifies the fuzzing device (again, do not ignore the write
operations) that the OS has successfully configured the device?

IMHO, the low throughput may be caused by deliberate time delays that
drivers insert to comply with hardware specifications. So we may improve
performance by patching out these delays.

Can we do better at getting execution information on closed-source OSes
with hardware tracing (like Intel PT) or other emulation techniques (binary
translation, or Digtool from 360 at the 26th USENIX Security)?