ZhaoLiu 2aaf7ad9fc x86: Support Host exposes CPU topology to Guest VM
At present the Guest generates its own CPU topology. To mitigate
cross-hyperthread speculative execution side channel attacks, the Guest
needs to be able to mirror the CPU topology of the Host for future
scheduling optimization.

Add a config option "--host-cpu-topology" that makes the vCPU count
identical to the physical CPU count and gives each vCPU the same APIC
ID in the MADT and CPUID as its corresponding physical CPU. The same
APIC ID ensures the same topology.

"--host-cpu-topology" requires the vCPU count to equal the pCPU count,
and it defaults the vCPU count to the pCPU count.

"--host-cpu-topology" also sets a default CPU affinity that pins each
vCPU to the pCPU with the same processor ID, e.g. 1=1:2=2:3=3:4=4, so
that a vCPU and its corresponding pCPU have the same processor ID and
the same APIC ID. The user cannot set CPU affinity manually when
"--host-cpu-topology" is set.

BUG=b:197875305
TEST=Set "--host-cpu-topology" option and check Guest's /proc/cpuinfo,
  lscpu, CPUID for different vCPU

Change-Id: Ibc4eb10649e89f43b81bde6d46d6e0e6c7234324
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/platform/crosvm/+/3217035
Tested-by: kokoro <noreply+kokoro@google.com>
Commit-Queue: Chirantan Ekbote <chirantan@chromium.org>
Reviewed-by: Chirantan Ekbote <chirantan@chromium.org>
2021-10-25 04:19:45 +00:00
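
A minimal sketch of how the new option might be exercised and verified, using placeholder kernel and disk paths together with flags that appear elsewhere in this document:

$ crosvm run --host-cpu-topology --rwdisk rootfs.img -p "root=/dev/vda" vmlinux
    <in the guest>
$ cat /proc/cpuinfo
$ lscpu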

crosvm - The Chrome OS Virtual Machine Monitor

This component, known as crosvm, runs untrusted operating systems along with virtualized devices. It runs VMs only through Linux's KVM interface. What makes crosvm unique is a focus on safety within the programming language and a sandbox around the virtual devices to protect the kernel from attack in case of an exploit in the devices.


Building for Linux

Setting up the development environment

Crosvm uses submodules to manage external dependencies. Initialize them via:

git submodule update --init

It is recommended to enable automatic recursive operations to keep the submodules in sync with the main repository (but do not push them automatically, as that can conflict with repo):

git config --global submodule.recurse true
git config push.recurseSubmodules no

Crosvm development works best on Debian derivatives. We provide a script to install the necessary packages on Debian:

$ ./tools/install-deps

For other systems, please see below for instructions on Using the development container.

Setting up for cross-compilation

Crosvm is built and tested on x86, aarch64 and armhf. Your host needs to be set up to allow installation of foreign architecture packages.

On Debian this is as easy as:

$ sudo dpkg --add-architecture arm64
$ sudo dpkg --add-architecture armhf
$ sudo apt update

On Ubuntu this is a little harder and requires some manual modification of the APT sources.
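
The gist is that arm64 and armhf packages are served from ports.ubuntu.com rather than the main archive, so the existing entries must be restricted to x86 architectures and separate ports entries added. A rough sketch of the relevant /etc/apt/sources.list lines, using "focal" purely as an example release (adjust the release name and components to your system):

deb [arch=amd64,i386] http://archive.ubuntu.com/ubuntu/ focal main restricted universe multiverse
deb [arch=arm64,armhf] http://ports.ubuntu.com/ubuntu-ports/ focal main restricted universe multiverse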

For other systems (including gLinux), please see below for instructions on Using the development container.

With that enabled, the following scripts will install the needed packages:

$ ./tools/install-aarch64-deps
$ ./tools/install-armhf-deps

Using the development container

We provide a Debian container with the required packages installed. With Docker installed, it can be started with:

$ ./tools/dev_container

The container image is big and may take a while to download when first used. Once started, you can follow all instructions in this document within the container shell.

Development

Iterative development

You can use cargo as usual for crosvm development to cargo build and cargo test the individual crates that you are working on.

If you are working on aarch64 specific code, you can use the set_test_target tool to instruct cargo to build for aarch64 and run tests on a VM:

$ ./tools/set_test_target vm:aarch64 && source .envrc
$ cd mycrate && cargo test

The script will start a VM for testing and write environment variables for cargo to .envrc. With those set, cargo build will build for aarch64 and cargo test will run tests inside the VM.

The aarch64 VM can be managed with the ./tools/aarch64vm script.

Running all tests

Crosvm cannot use cargo test --workspace because of various restrictions of cargo. So we have our own test runner:

$ ./tools/run_tests

This will run all tests locally. Since we have some architecture-dependent code, we also have the option of running tests within an aarch64 VM:

$ ./tools/run_tests --target=vm:aarch64

When working on a machine that does not support cross-compilation (e.g. gLinux), you can use the dev container to build and run the tests.

$ ./tools/dev_container ./tools/run_tests --target=vm:aarch64

Note, however, that using an interactive shell in the container is preferred, as build artifacts are not preserved between calls:

$ ./tools/dev_container
crosvm_dev$ ./tools/run_tests --target=vm:aarch64

It is also possible to run tests on a remote machine via ssh. The target architecture is automatically detected:

$ ./tools/run_tests --target=ssh:hostname

However, it is your responsibility to make sure the required libraries for crosvm are installed and password-less authentication is set up. See ./tools/impl/testvm/cloud_init.yaml for hints on what the VM has installed.

Presubmit checks

To verify changes before submitting, use the presubmit script:

$ ./tools/presubmit

or

$ ./tools/presubmit --quick

This will run clippy, the formatters, and all tests. The --quick variant will skip some slower checks, like building for other platforms.

Known issues

  • By default, crosvm runs devices in sandboxed mode, which requires seccomp policy files to be set up. For local testing it is often easier to pass --disable-sandbox to run everything in a single process.
  • If your Linux header files are too old, you may find minijail rejecting seccomp filters for containing unknown syscalls. You can try removing the offending lines from the filter file, or add --seccomp-log-failures to the crosvm command line to turn these into warnings. Note that this option will also stop minijail from killing processes that violate the seccomp rule, making the sandboxing much less aggressive.
  • Seccomp policy files have hardcoded absolute paths. You can either fix up the paths locally, or set up an awesome hacky symlink: sudo mkdir /usr/share/policy && sudo ln -s /path/to/crosvm/seccomp/x86_64 /usr/share/policy/crosvm. We'll eventually build the precompiled policies into the crosvm binary.
  • Devices can't be jailed if /var/empty doesn't exist. sudo mkdir -p /var/empty to work around this for now.
  • You need read/write permissions for /dev/kvm to run tests or other crosvm instances. Usually it's owned by the kvm group, so sudo usermod -a -G kvm $USER and then log out and back in again to fix this.
  • Some other features (networking) require CAP_NET_ADMIN so those usually need to be run as root.

Building for ChromeOS

crosvm is included in the ChromeOS source tree at src/platform/crosvm. Crosvm can be built with ChromeOS features using Portage or cargo.

If ChromeOS-specific features are not needed, or you want to run the full test suite of crosvm, the Building for Linux and Running crosvm tests workflows can be used from the crosvm repository of ChromeOS as well.

Using Portage

crosvm on ChromeOS is usually built with Portage, so it follows the same general workflow as any cros_workon package. The full package name is chromeos-base/crosvm.

See the Chromium OS developer guide for more on how to build and deploy with Portage.
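
As a sketch, the usual cros_workon flow looks roughly like the following; this is the generic ChromeOS developer workflow rather than crosvm-specific documentation, and the board and device names are placeholders:

$ cros_workon --board=${BOARD} start chromeos-base/crosvm
$ emerge-${BOARD} chromeos-base/crosvm
$ cros deploy ${DUT_IP} chromeos-base/crosvm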

NOTE: cros_workon_make modifies crosvm's Cargo.toml and Cargo.lock. Please be careful not to commit the changes. Moreover, with those changes in place, cargo will fail to build and the clippy preupload check will fail.

Using Cargo

Since development using portage can be slow, it's possible to build crosvm for ChromeOS using cargo for faster iteration times. To do so, the Cargo.toml file needs to be updated to point to dependencies provided by ChromeOS using ./setup_cros_cargo.sh.

Usage

To see the usage information for your version of crosvm, run crosvm or crosvm run --help.

Boot a Kernel

To run a very basic VM with just a kernel and default devices:

$ crosvm run "${KERNEL_PATH}"

The uncompressed kernel image, also known as vmlinux, can be found in your kernel build directory in the case of x86 at arch/x86/boot/compressed/vmlinux.

Rootfs

With a disk image

In most cases, you will want to give the VM a virtual block device to use as a root file system:

$ crosvm run -r "${ROOT_IMAGE}" "${KERNEL_PATH}"

The root image must be a path to a disk image formatted in a way that the kernel can read. Typically this is a squashfs image made with mksquashfs or an ext4 image made with mkfs.ext4. The -r argument automatically tells the kernel to use that image as the root, and therefore it can only be given once. More disks can be given with -d, or with --rwdisk if a writable disk is desired.
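
The text above only names the tools, so here is one possible way to produce a small ext4 root image, assuming debootstrap is available; sizes and paths are arbitrary:

$ truncate -s 2G rootfs.img
$ mkfs.ext4 rootfs.img
$ sudo mount -o loop rootfs.img /mnt
$ sudo debootstrap stable /mnt    # populate the image with a minimal Debian userspace
$ sudo umount /mnt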

To run crosvm with a writable rootfs:

WARNING: Writable disks are at risk of corruption by a malicious or malfunctioning guest OS.

crosvm run --rwdisk "${ROOT_IMAGE}" -p "root=/dev/vda" vmlinux

NOTE: If more disk arguments are added before the desired rootfs image, root=/dev/vda must be adjusted to the appropriate letter.

With virtiofs

Linux kernel 5.4+ is required for using virtiofs. This is convenient for testing, since a host directory can be shared directly without building a disk image. The file system tag must be named "mtd*" or "ubi*", since the kernel only accepts those prefixes for a root device that is not a block device.

crosvm run --shared-dir "/:mtdfake:type=fs:cache=always" \
    -p "rootfstype=virtiofs root=mtdfake" vmlinux

Control Socket

If the control socket was enabled with -s, the main process can be controlled while crosvm is running. To tell crosvm to stop and exit, for example:

NOTE: If the socket path given is for a directory, a socket name underneath that path will be generated based on crosvm's PID.

$ crosvm run -s /run/crosvm.sock ${USUAL_CROSVM_ARGS}
    <in another shell>
$ crosvm stop /run/crosvm.sock

WARNING: The guest OS will not be notified or gracefully shutdown.

This will cause the original crosvm process to exit in an orderly fashion, allowing it to clean up any OS resources that might have stuck around if crosvm were terminated early.

Multiprocess Mode

By default crosvm runs in multiprocess mode. Each device that supports running inside of a sandbox will run in a jailed child process of crosvm. The appropriate minijail seccomp policy files must be present either in /usr/share/policy/crosvm or in the path specified by the --seccomp-policy-dir argument. The sandbox can be disabled for testing with the --disable-sandbox option.
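
For local experiments, the two options named above can be used either to point crosvm at a policy directory or to disable sandboxing entirely; the path here is only illustrative:

$ crosvm run --seccomp-policy-dir /path/to/crosvm/seccomp/x86_64 ${USUAL_CROSVM_ARGS} vmlinux
$ crosvm run --disable-sandbox ${USUAL_CROSVM_ARGS} vmlinux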

Virtio Wayland

Virtio Wayland support requires special support on the part of the guest and as such is unlikely to work out of the box unless you are using a Chrome OS kernel along with a termina rootfs.

To use it, ensure that the XDG_RUNTIME_DIR environment variable is set and that the path $XDG_RUNTIME_DIR/wayland-0 points to the socket of the Wayland compositor you would like the guest to use.
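
A quick sanity check before launching the guest, using the socket name described above:

$ echo "${XDG_RUNTIME_DIR}"
$ ls -l "${XDG_RUNTIME_DIR}/wayland-0"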

GDB Support

crosvm supports the GDB Remote Serial Protocol to allow developers to debug the guest kernel via GDB.

You can enable the feature with the --gdb flag:

# Use uncompressed vmlinux
$ crosvm run --gdb <port> ${USUAL_CROSVM_ARGS} vmlinux

Then, you can start GDB in another shell.

$ gdb vmlinux
(gdb) target remote :<port>
(gdb) hbreak start_kernel
(gdb) c
<start booting in the other shell>

For general techniques for debugging the Linux kernel via GDB, see this kernel documentation.

Defaults

The following are crosvm's default arguments and how to override them; a combined example follows the list.

  • 256MB of memory (set with -m)
  • 1 virtual CPU (set with -c)
  • no block devices (set with -r, -d, or --rwdisk)
  • no network (set with --host_ip, --netmask, and --mac)
  • virtio wayland support if the XDG_RUNTIME_DIR environment variable is set (disable with --no-wl)
  • only the kernel arguments necessary to run with the supported devices (add more with -p)
  • run in multiprocess mode (run in single process mode with --disable-sandbox)
  • no control socket (set with -s)
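
For example, a single invocation that overrides several of these defaults (values and paths are only illustrative):

$ crosvm run -m 1024 -c 2 -s /run/crosvm.sock --rwdisk rootfs.img -p "root=/dev/vda" vmlinux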

System Requirements

A Linux kernel with KVM support (check for /dev/kvm) is required to run crosvm. In order to run certain devices, there are additional system requirements; a quick check appears after the list:

  • virtio-wayland - The memfd_create syscall, introduced in Linux 3.17, and a Wayland compositor.
  • vsock - Host Linux kernel with vhost-vsock support, introduced in Linux 4.8.
  • multiprocess - Host Linux kernel with seccomp-bpf and Linux namespacing support.
  • virtio-net - Host Linux kernel with TUN/TAP support (check for /dev/net/tun) and running with CAP_NET_ADMIN privileges.
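
A quick way to check the most common of these on the host; the device node names are the usual defaults:

$ ls -l /dev/kvm /dev/net/tun
$ ls -l /dev/vhost-vsock    # present only when the vhost_vsock module is loaded
$ groups | grep -q kvm && echo "in kvm group"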

Emulated Devices

  • CMOS/RTC - Used to get the current calendar time.
  • i8042 - Used by the guest kernel to exit crosvm.
  • serial - x86 I/O port driven serial devices that print to stdout and take input from stdin.
  • virtio-block - Basic read/write block device.
  • virtio-net - Device to interface the host and guest networks.
  • virtio-rng - Entropy source used to seed the guest OS's entropy pool.
  • virtio-vsock - Enables VSOCKs for the guests.
  • virtio-wayland - Allows the guest to use the host's Wayland socket.

Contributing

Code Health

rustfmt

All code should be formatted with rustfmt. We have a script that applies rustfmt to all Rust code in the crosvm repo: please run bin/fmt before checking in a change. This is different from cargo fmt --all, which formats multiple crates but only within a single workspace; crosvm consists of multiple workspaces.

clippy

The clippy linter is used to check for common Rust problems. The crosvm project uses a specific set of clippy checks; please run bin/clippy before checking in a change.

Dependencies

ChromeOS and Android both have a review process for third party dependencies to ensure that code included in the product is safe. Since crosvm needs to build on both, this means we are restricted in our usage of third party crates. When in doubt, do not add new dependencies.

Code Overview

The crosvm source code is written in Rust and C. To build, crosvm generally requires the most recent stable version of rustc.

Source code is organized into crates, each with their own unit tests. These crates are:

  • crosvm - The top-level binary front-end for using crosvm.
  • devices - Virtual devices exposed to the guest OS.
  • kernel_loader - Loads elf64 kernel files to a slice of memory.
  • kvm_sys - Low-level (mostly) auto-generated structures and constants for using KVM.
  • kvm - Unsafe, low-level wrapper code for using kvm_sys.
  • net_sys - Low-level (mostly) auto-generated structures and constants for creating TUN/TAP devices.
  • net_util - Wrapper for creating TUN/TAP devices.
  • sys_util - Mostly safe wrappers for small system facilities such as eventfd or syslog.
  • syscall_defines - Lists of syscall numbers in each architecture used to make syscalls not supported in libc.
  • vhost - Wrappers for creating vhost based devices.
  • virtio_sys - Low-level (mostly) auto-generated structures and constants for interfacing with kernel vhost support.
  • vm_control - IPC for the VM.
  • x86_64 - Support code specific to 64-bit Intel machines.

The seccomp folder contains minijail seccomp policy files for each sandboxed device. Because some syscalls vary by architecture, the seccomp policies are split by architecture.