Initial commit

fbshipit-source-id: c440d991296c92bdc5e109a11d269049e8840e94
facebook-github-bot 2021-12-29 16:14:23 -08:00
commit 15d2f61411
226 changed files with 37281 additions and 0 deletions

91
.github/workflows/ci.yml vendored Normal file

@ -0,0 +1,91 @@
name: ci
on:
  push:
  pull_request:
jobs:
  check:
    name: Check
    runs-on: ubuntu-latest
    steps:
      - name: Install libunwind-dev
        run: sudo apt-get install -y libunwind-dev
      - name: Checkout sources
        uses: actions/checkout@v2
      - name: Install nightly toolchain
        uses: actions-rs/toolchain@v1
        with:
          profile: minimal
          toolchain: nightly
          override: true
      - name: Run cargo check
        uses: actions-rs/cargo@v1
        with:
          command: check
  test:
    name: Test Suite
    runs-on: ubuntu-latest
    steps:
      - name: Install libunwind-dev
        run: sudo apt-get install -y libunwind-dev
      - name: Checkout sources
        uses: actions/checkout@v2
      - name: Install nightly toolchain
        uses: actions-rs/toolchain@v1
        with:
          profile: minimal
          toolchain: nightly
          override: true
      - name: Run cargo test
        uses: actions-rs/cargo@v1
        with:
          command: test
          args: -- --test-threads=1
  ## Currently disabled because internal version of rustfmt produces different
  ## formatting.
  # rustfmt:
  #   name: Check format
  #   runs-on: ubuntu-latest
  #   steps:
  #     - name: Checkout sources
  #       uses: actions/checkout@v2
  #
  #     - name: Install nightly toolchain
  #       uses: actions-rs/toolchain@v1
  #       with:
  #         profile: minimal
  #         toolchain: nightly
  #         override: true
  #         components: rustfmt
  #
  #     - name: Run cargo fmt
  #       uses: actions-rs/cargo@v1
  #       with:
  #         command: fmt
  #         args: --all -- --check
  clippy:
    name: Clippy
    runs-on: ubuntu-latest
    steps:
      - name: Install libunwind-dev
        run: sudo apt-get install -y libunwind-dev
      - uses: actions/checkout@v2
      - uses: actions-rs/toolchain@v1
        with:
          toolchain: nightly
          components: clippy
          override: true
      - uses: actions-rs/clippy-check@v1
        with:
          token: ${{ secrets.GITHUB_TOKEN }}

2
.gitignore vendored Normal file

@ -0,0 +1,2 @@
target/
Cargo.lock

5
CHANGELOG.md Normal file

@ -0,0 +1,5 @@
# Reverie
## 0.1.0 (December 1, 2021)
- Initial release

80
CODE_OF_CONDUCT.md Normal file

@ -0,0 +1,80 @@
# Code of Conduct
## Our Pledge
In the interest of fostering an open and welcoming environment, we as
contributors and maintainers pledge to make participation in our project and
our community a harassment-free experience for everyone, regardless of age, body
size, disability, ethnicity, sex characteristics, gender identity and expression,
level of experience, education, socio-economic status, nationality, personal
appearance, race, religion, or sexual identity and orientation.
## Our Standards
Examples of behavior that contributes to creating a positive environment
include:
* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members
Examples of unacceptable behavior by participants include:
* The use of sexualized language or imagery and unwelcome sexual attention or
advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic
address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Our Responsibilities
Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.
Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.
## Scope
This Code of Conduct applies within all project spaces, and it also applies when
an individual is representing the project or its community in public spaces.
Examples of representing a project or community include using an official
project e-mail address, posting via an official social media account, or acting
as an appointed representative at an online or offline event. Representation of
a project may be further defined and clarified by project maintainers.
This Code of Conduct also applies outside the project spaces when there is a
reasonable belief that an individual's behavior may have a negative impact on
the project or its community.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the project team at <opensource-conduct@fb.com>. All
complaints will be reviewed and investigated and will result in a response that
is deemed necessary and appropriate to the circumstances. The project team is
obligated to maintain confidentiality with regard to the reporter of an incident.
Further details of specific enforcement policies may be posted separately.
Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
[homepage]: https://www.contributor-covenant.org
For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq

46
CONTRIBUTING.md Normal file

@ -0,0 +1,46 @@
# Contributing to Reverie
We want to make contributing to this project as easy and transparent as
possible.
## Our Development Process
Reverie is currently developed in Meta's internal repositories and then
exported out to GitHub by a Meta team member; however, we invite you to
submit pull requests as described below.
## Pull Requests
We actively welcome your pull requests.
1. Fork the repo and create your branch from `main`.
2. If you've added code that should be tested, add tests.
3. If you've changed APIs, update the documentation.
4. Ensure the test suite passes.
5. Make sure your code lints.
6. If you haven't already, complete the Contributor License Agreement ("CLA").
## Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only
need to do this once to work on any of Meta's open source projects.
Complete your CLA here: <https://code.facebook.com/cla>
## Issues
We use GitHub issues to track public bugs. Please ensure your description is
clear and has sufficient instructions to be able to reproduce the issue.
Meta has a [bounty program](https://www.facebook.com/whitehat/) for the safe
disclosure of security bugs. In those cases, please go through the process
outlined on that page and do not file a public issue.
## Coding Style
Follow the automatic `rustfmt` configuration.
## License
By contributing to Reverie, you agree that your contributions will be
licensed under the LICENSE file in the root directory of this source tree.

9
Cargo.toml Normal file

@ -0,0 +1,9 @@
[workspace]
members = [
"reverie",
"reverie-examples",
"reverie-process",
"reverie-ptrace",
"reverie-syscalls",
"reverie-util",
]

31
LICENSE Normal file

@ -0,0 +1,31 @@
Copyright notices are included in each source file. For other files,
the below copyright applies:
Copyright (c) 2018-2019, Trustees of Indiana University
("University Works" via Baojun Wang)
Copyright (c) 2018-2019, Ryan Newton
("Traditional Works of Scholarship")
Copyright (c) 2020-, Facebook, Inc. and its affiliates.
BSD 2-Clause License
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

110
README.md Normal file

@ -0,0 +1,110 @@
# Reverie
Reverie is a user space system-call interception framework for x86-64 Linux.
It can be used to intercept, modify, or elide a syscall before the kernel
executes it. In essence, Reverie sits at the boundary between user space and
kernel space.
Some potential use cases include:
* Observability tools, like `strace`.
* Failure injection to test error handling logic.
* Manipulating scheduling decisions to expose concurrency bugs.
See the [`reverie-examples`](reverie-examples) directory for examples of
tools that can be built with this library.
## Features
* Ergonomic syscall handling. It is easy to modify syscall arguments or return
values, inject multiple syscalls, or suppress the syscall entirely.
* Async-await usage allows blocking syscalls to be handled without blocking
other guest threads.
* Can intercept CPUID and RDTSC instructions.
* Typed syscalls. Every syscall has a wrapper to make it easier to access
pointer values. This also enables strace-like pretty-printing for free.
* Avoid intercepting syscalls we don't care about. For example, if we only care
about `sys_open`, we can avoid paying the cost of intercepting other
syscalls.
* Can act as a GDB server. This allows connection via the GDB client where you
can step through the process that is being traced by Reverie.
## Terminology and Background
Clients of the Reverie library write ***tools***. A tool runs a shell command
in an instrumented manner, creating a ***guest*** process tree composed of
multiple guest threads and processes. Each Reverie tool is written as a set
of callbacks (i.e., ***handlers***), which are invoked each time a guest
thread encounters a trappable event such as a system call or inbound signal.
The tool can stipulate exactly which event streams it ***subscribes*** to.
The tool itself is stateful, maintaining state between consecutive
invocations.
## Usage
Currently, there is only the `reverie-ptrace` backend which uses `ptrace` to
intercept syscalls. Copy one of the example tools to a new Rust project (e.g.
`cargo init`). You'll see that it depends both on the general `reverie` crate
for the API and on the specific backend implementation crate,
`reverie_ptrace`.
## Performance
Since `ptrace` adds significant overhead when the guest has a syscall-heavy
workload, Reverie will add similarly-significant overhead. The slowdown depends
on how many syscalls are being performed and are intercepted by the tool.
The primary way you can improve performance with the current implementation is
to implement the `subscriptions` callback, specifying a minimal set of syscalls
that are actually required by your tool.
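To make the effect of a narrow subscription concrete, here is a minimal sketch in plain Rust. It is a simplified model and not Reverie's API: `Sysno`, `Subscriptions`, and `intercepted` are hypothetical names invented for this illustration. In a real tool the `subscriptions` callback returns a `Subscription` value instead.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a syscall identifier (not Reverie's `Sysno`).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Sysno {
    Read,
    Write,
    Open,
    Close,
}

/// Hypothetical model of a subscription: the set of syscalls a tool wants
/// to be notified about.
pub struct Subscriptions {
    wanted: HashSet<Sysno>,
}

impl Subscriptions {
    /// Subscribes to exactly the given syscalls.
    pub fn only(wanted: &[Sysno]) -> Self {
        Self {
            wanted: wanted.iter().copied().collect(),
        }
    }

    pub fn contains(&self, sysno: Sysno) -> bool {
        self.wanted.contains(&sysno)
    }
}

/// Counts how many events in `trace` would reach the tool's handler.
/// Events outside the subscription never pay the interception cost.
pub fn intercepted(subs: &Subscriptions, trace: &[Sysno]) -> usize {
    trace.iter().filter(|s| subs.contains(**s)).count()
}
```

With a trace of `open`, `read`, `read`, `write`, `close`, a tool subscribed only to `Sysno::Open` handles one event instead of five in this model.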
## Overall architecture
When implementing a Reverie tool, there are three main components of the tool to
consider:
* The process-level state,
* the thread-level state, and
* the global state (which is shared among all processes and threads in the
traced process tree).
This separation of process-, thread-, and global-state is meant to provide an
abstraction that allows future Reverie backends to be used without requiring the
tool to be rewritten.
![architecture](./assets/architecture-diagram.svg "Architecture Diagram")
### Process State
Whenever a new process is spawned (i.e., when `fork` or `clone` is called by the
guest), a new instance of the process state struct is created and managed by the
Reverie backend.
### Thread State
When a syscall is intercepted, it is always associated with the thread that
called it.
### Global State
The global state is accessed via RPC messages. Since a future Reverie backend
may use in-guest syscall interception, the syscall handler code may not be
running in the same address space. Thus, all shared state is communicated via
RPC messages. (There is, however, currently only a single ptrace-based backend
where all tracer code is in the same address space.)
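The message-passing discipline described above can be sketched with nothing but the standard library. This is an illustrative model under stated assumptions, not Reverie's implementation: `Request`, `spawn_global`, and `call_global` are hypothetical names, and a channel between threads stands in for whatever transport a backend would use.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical request type; a real tool defines its own messages.
pub enum Request {
    /// Add `n` to the global counter and reply with the new total.
    Add(u64, mpsc::Sender<u64>),
}

/// Spawns the "global state" on its own thread. Handlers never touch the
/// counter directly; they only exchange messages with it, so they could
/// just as well run in a different address space.
pub fn spawn_global() -> (mpsc::Sender<Request>, thread::JoinHandle<u64>) {
    let (tx, rx) = mpsc::channel::<Request>();
    let handle = thread::spawn(move || {
        let mut total = 0u64;
        // The loop ends once every sender has been dropped.
        for req in rx {
            match req {
                Request::Add(n, reply) => {
                    total += n;
                    // The caller may have gone away; ignore a failed reply.
                    let _ = reply.send(total);
                }
            }
        }
        total
    });
    (tx, handle)
}

/// Client-side helper: sends one request and waits for its response.
pub fn call_global(global: &mpsc::Sender<Request>, n: u64) -> u64 {
    let (reply_tx, reply_rx) = mpsc::channel();
    global
        .send(Request::Add(n, reply_tx))
        .expect("global state is gone");
    reply_rx.recv().expect("no reply from global state")
}
```

Because callers only ever hold a sender, swapping the channel for a cross-process transport would not change the handler code in this sketch, which is the property the RPC-based design is after.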
## Future Plans
* Add a more performant backend. The rough goal is to have handlers executing in
the guest with close to regular function call overhead. Global state and its
methods will still be centralized, but the RPC/IPC mechanism between guest &
the centralized tool process will become much more efficient.
## Contributing
Contributions are welcome! Please see the [CONTRIBUTING.md](CONTRIBUTING.md)
file for guidance.
## License
Reverie is BSD 2-Clause licensed as found in the [LICENSE](LICENSE) file.

File diff suppressed because one or more lines are too long



@ -0,0 +1,61 @@
# @generated by autocargo
[package]
name = "reverie-examples"
version = "0.1.0"
authors = ["Facebook"]
edition = "2021"
license = "BSD-2-Clause"
publish = false
[[bin]]
name = "chaos"
path = "chaos.rs"
[[bin]]
name = "chrome_trace"
path = "chrome-trace/main.rs"
[[bin]]
name = "chunky_print"
path = "chunky_print.rs"
[[bin]]
name = "counter1"
path = "counter1.rs"
[[bin]]
name = "counter2"
path = "counter2.rs"
[[bin]]
name = "debug"
path = "debug.rs"
[[bin]]
name = "noop"
path = "noop.rs"
[[bin]]
name = "pedigree"
path = "pedigree.rs"
[[bin]]
name = "strace"
path = "strace/main.rs"
[[bin]]
name = "strace_minimal"
path = "strace_minimal.rs"
[dependencies]
anyhow = "1.0.51"
nix = "0.22"
reverie = { version = "0.1.0", path = "../reverie" }
reverie-ptrace = { version = "0.1.0", path = "../reverie-ptrace" }
reverie-util = { version = "0.1.0", path = "../reverie-util" }
serde = { version = "1.0.126", features = ["derive", "rc"] }
serde_json = { version = "1.0.64", features = ["float_roundtrip", "unbounded_depth"] }
structopt = "0.3.23"
tokio = { version = "1.10", features = ["full", "test-util", "tracing"] }
tracing = "0.1.29"


@ -0,0 +1,63 @@
# Examples
Example tools built on top of Reverie.
Copying one of these examples is the recommended way to get started using
Reverie.
# chrome-trace: Generates a chrome trace file
This tool is like `strace`, but generates a trace file that can be loaded in
`chrome://tracing/`.
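As a rough illustration of what such a trace file contains, the sketch below formats a "complete" event (`"ph": "X"`) by hand, with timestamps and durations in microseconds relative to a chosen epoch. The function names `complete_event` and `trace_file` are hypothetical, and the sketch assumes `name` needs no JSON escaping; the real tool builds its events with `serde_json` instead.

```rust
use std::time::SystemTime;

/// Formats one "complete" (`"ph": "X"`) trace event. `ts` and `dur` are in
/// microseconds. Returns `None` if `start` precedes `epoch` or `end`
/// precedes `start`. Assumes `name` contains no characters that need
/// JSON escaping.
pub fn complete_event(
    name: &str,
    epoch: SystemTime,
    start: SystemTime,
    end: SystemTime,
    pid: u32,
    tid: u32,
) -> Option<String> {
    let ts = start.duration_since(epoch).ok()?.as_micros();
    let dur = end.duration_since(start).ok()?.as_micros();
    Some(format!(
        r#"{{"name":"{}","cat":"syscall","ph":"X","ts":{},"dur":{},"pid":{},"tid":{}}}"#,
        name, ts, dur, pid, tid
    ))
}

/// Joins already-formatted events into the JSON-array form of a trace file.
pub fn trace_file(events: &[String]) -> String {
    format!("[{}]", events.join(","))
}
```

Returning `None` for out-of-order timestamps is a choice made for this sketch; it avoids a panic if the system clock steps backwards between measurements.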
# counter1: Reverie Counter Tool (1)
This is a basic example of event counting. It counts the number of system
calls and reports that single integer at exit.
This version of the tool uses a single, centralized piece of global state.
# counter2: Reverie Counter Tool (2)
This is a basic example of event counting. This tool counts the number of
system calls and reports that single integer at exit.
This implementation of the tool uses a *distributed* notion of state,
maintaining a per-thread, per-process, and global state. Basically, this is
an example of "MapReduce" style tracing of a process tree.
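The "MapReduce" idea can be sketched independently of Reverie: each worker counts in its own local state and the totals are combined only at exit. The code below is a standard-library model, not the tool's source. `count_events` is a hypothetical name, and ordinary threads stand in for guest threads.

```rust
use std::thread;

/// Each worker counts its own events in thread-local state ("map") and
/// hands the total back only when it exits; the parent then sums the
/// per-thread totals ("reduce"). No state is shared while counting.
pub fn count_events(per_thread_events: &[u64]) -> u64 {
    let handles: Vec<thread::JoinHandle<u64>> = per_thread_events
        .iter()
        .map(|&n| {
            thread::spawn(move || {
                let mut local = 0u64;
                for _ in 0..n {
                    local += 1; // stand-in for "a syscall was observed"
                }
                local
            })
        })
        .collect();

    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .sum()
}
```

Compared with a single centralized counter, this shape trades one message per event for one message per thread exit.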
# noop: Identity Function Tool
This instrumentation tool intercepts events but does nothing with them. It is
useful for observing the overhead of interception, and as a starting point.
# chunky_print: Print-gating Tool
This example tool intercepts write events on stdout and stderr and
manipulates either when those outputs are released, or the scheduling order
that determines the order of printed output.
# pedigree: Deterministic virtual process IDs
This tool monitors the spawning of new processes and maps each new PID to a
deterministic virtual PID. The new virtual PID is reported after each
process-spawning syscall.
This tool is a work-in-progress and is not yet functioning.
`pedigree.rs` is an implementation of pedigree / virtual PID generation using local state.
`virtual_process_tree.rs` is an implementation which uses global state.
# strace: Reverie Echo Tool
This instrumentation tool simply echoes intercepted events, like strace.
# chaos: Chaos Tool
This tool is meant to emulate a pathological kernel where:
1. `read` and `recvfrom` calls return only one byte at a time. This is
intended to catch errors in parsers that assume multiple bytes will be
returned at a time.
2. `EINTR` is returned instead of running the real syscall for every other
read.
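A program that survives this treatment has to tolerate both short reads and interrupted reads. The sketch below shows one way to write such a loop and exercises it against a deliberately misbehaving reader. `ChaoticReader` and `read_full` are hypothetical names for this illustration; the reader only imitates the behavior described above and does not use Reverie.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical reader that misbehaves the way the chaos tool describes:
/// every other call fails with `Interrupted`, and successful calls return
/// at most one byte.
pub struct ChaoticReader<'a> {
    data: &'a [u8],
    interrupt_next: bool,
}

impl<'a> ChaoticReader<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Self {
            data,
            interrupt_next: true,
        }
    }
}

impl Read for ChaoticReader<'_> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.interrupt_next {
            self.interrupt_next = false;
            return Err(io::Error::new(ErrorKind::Interrupted, "simulated EINTR"));
        }
        self.interrupt_next = true;
        if buf.is_empty() || self.data.is_empty() {
            return Ok(0);
        }
        buf[0] = self.data[0];
        self.data = &self.data[1..];
        Ok(1)
    }
}

/// Reads until `buf` is full or the input ends, retrying interrupted reads
/// and tolerating short reads. Returns the number of bytes read.
pub fn read_full<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // end of input
            Ok(n) => filled += n,
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}
```

A parser that calls `read` once and assumes the buffer was filled fails against `ChaoticReader` on the very first call, which is the class of bug the chaos tool is meant to expose.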

179
reverie-examples/chaos.rs Normal file

@ -0,0 +1,179 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use serde::{Deserialize, Serialize};
use std::sync::atomic::{AtomicU64, Ordering};
use structopt::StructOpt;
use reverie::{
syscalls::{Displayable, Errno, Syscall},
Error, GlobalTool, Guest, Pid, Tool,
};
use reverie_util::CommonToolArguments;
/// A tool to inject "chaos" into a running process. A pathological
/// kernel is simulated by forcing reads to only return one byte at a time.
#[derive(Debug, StructOpt)]
struct Args {
#[structopt(flatten)]
common_opts: CommonToolArguments,
#[structopt(flatten)]
chaos_opts: ChaosOpts,
}
#[derive(StructOpt, Debug, Serialize, Deserialize, Clone, Default)]
struct ChaosOpts {
/// Skips the first N syscalls of a process before doing any intervention.
/// This is useful when you need to skip past an error caused by the tool.
#[structopt(long, value_name = "N", default_value = "0")]
skip: u64,
/// If set, does not intercept `read`-like system calls and modify them.
#[structopt(long)]
no_read: bool,
/// If set, does not intercept `recv`-like system calls and modify them.
#[structopt(long)]
no_recv: bool,
/// If set, does not inject random `EINTR` errors.
#[structopt(long)]
no_interrupt: bool,
}
#[derive(Debug, Serialize, Deserialize, Default)]
struct ChaosTool {
count: AtomicU64,
}
impl Clone for ChaosTool {
fn clone(&self) -> Self {
ChaosTool {
count: AtomicU64::new(self.count.load(Ordering::SeqCst)),
}
}
}
#[derive(Debug, Serialize, Deserialize, Default, Clone)]
struct ChaosToolGlobal {}
#[reverie::global_tool]
impl GlobalTool for ChaosToolGlobal {
type Config = ChaosOpts;
async fn receive_rpc(&self, _from: Pid, _request: ()) {}
}
#[reverie::tool]
impl Tool for ChaosTool {
type ThreadState = bool;
type GlobalState = ChaosToolGlobal;
fn new(_pid: Pid, _cfg: &ChaosOpts) -> Self {
Self {
count: AtomicU64::new(0),
}
}
async fn handle_syscall_event<T: Guest<Self>>(
&self,
guest: &mut T,
syscall: Syscall,
) -> Result<i64, Error> {
let count = self.count.fetch_add(1, Ordering::SeqCst);
let config = guest.config().clone();
let memory = guest.memory();
// This provides a way to wait until the dynamic linker has done its job
// before we start trying to create chaos. glibc's dynamic linker has a
// bug where it doesn't retry `read` calls that don't return the
// expected amount of data.
if count < config.skip {
eprintln!(
"SKIPPED [pid={}, n={}] {}",
guest.pid(),
count,
syscall.display(&memory),
);
return guest.tail_inject(syscall).await;
}
// Transform the syscall arguments.
let syscall = match syscall {
Syscall::Read(read) => {
if !config.no_interrupt && !*guest.thread_state() {
// Return an EINTR instead of running the syscall.
// Programs should always retry the read in this case.
*guest.thread_state_mut() = true;
// XXX: inject a signal like SIGINT?
let ret = Err(Errno::ERESTARTSYS);
eprintln!(
"[pid={}, n={}] {} = {}",
guest.pid(),
count,
syscall.display(&memory),
ret.unwrap_or_else(|errno| -errno.into_raw() as i64)
);
return Ok(ret?);
} else if !config.no_read {
// Reduce read length to 1 byte at most.
Syscall::Read(read.with_len(1.min(read.len())))
} else {
// Return syscall unmodified.
Syscall::Read(read)
}
}
Syscall::Recvfrom(recv) if !config.no_recv => {
// Reduce recv length to 1 byte at most.
Syscall::Recvfrom(recv.with_len(1.min(recv.len())))
}
x => {
eprintln!(
"[pid={}, n={}] {}",
guest.pid(),
count,
syscall.display(&memory),
);
return guest.tail_inject(x).await;
}
};
*guest.thread_state_mut() = false;
let ret = guest.inject(syscall).await;
eprintln!(
"[pid={}, n={}] {} = {}",
guest.pid(),
count,
syscall.display_with_outputs(&memory),
ret.unwrap_or_else(|errno| -errno.into_raw() as i64)
);
Ok(ret?)
}
}
#[tokio::main]
async fn main() -> Result<(), Error> {
let args = Args::from_args();
let log_guard = args.common_opts.init_tracing();
let tracer = reverie_ptrace::TracerBuilder::<ChaosTool>::new(args.common_opts.into())
.config(args.chaos_opts)
.spawn()
.await?;
let (status, _) = tracer.wait().await?;
drop(log_guard); // Flush logs before exiting.
status.raise_or_exit()
}


@ -0,0 +1,169 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use std::path::PathBuf;
use std::time::SystemTime;
use reverie::syscalls::Sysno;
use reverie::Errno;
use reverie::ExitStatus;
use reverie::Pid;
use reverie::Tid;
use serde::{Deserialize, Serialize};
use serde_json::json;
/// A message sent to the global state whenever a thread shuts down.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ThreadExit {
/// Process ID.
pub pid: Pid,
/// Thread ID.
pub tid: Tid,
/// The start time of the thread.
pub start: SystemTime,
/// The end time of the thread.
pub end: SystemTime,
/// The series of events from this thread.
pub events: Vec<Event>,
/// The final exit status of this thread.
pub exit_status: ExitStatus,
}
// TODO: Handle signal, rdtsc, and cpuid events.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum Event {
/// A syscall event. Happens whenever a syscall happens.
Syscall {
/// The time at which the syscall started.
start: SystemTime,
/// The time at which the syscall completed.
end: SystemTime,
/// The syscall number.
sysno: Sysno,
/// The formatted syscall with all of its arguments.
pretty: String,
/// The result of the syscall.
result: Result<i64, Errno>,
},
/// A successful execve event.
Exec {
/// The time at which the execve syscall was executed.
timestamp: SystemTime,
/// The program being executed.
program: Program,
},
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Program {
/// The path to the program.
pub name: PathBuf,
/// The program arguments.
pub args: Vec<String>,
}
impl Program {
pub fn new(name: PathBuf, args: Vec<String>) -> Self {
Self { name, args }
}
}
impl ThreadExit {
pub fn trace_event(&self, epoch: SystemTime, events: &mut Vec<serde_json::Value>) {
let thread_name = format!("TID {}", self.tid);
// Record the thread/process start.
{
let ts = self.start.duration_since(epoch).unwrap().as_micros() as u64;
events.push(json!({
"name": thread_name,
"cat": "process",
"ph": "B",
"ts": ts,
"pid": self.pid,
"tid": self.tid,
}));
}
for event in &self.events {
match event {
Event::Syscall {
start,
end,
sysno,
pretty,
result,
} => {
let ts = start.duration_since(epoch).unwrap().as_micros() as u64;
let duration = end.duration_since(*start).unwrap().as_micros() as u64;
events.push(json!({
"name": sysno.to_string(),
"cat": "syscall",
"ph": "X",
"ts": ts,
"dur": duration,
"pid": self.pid,
"tid": self.tid,
"args": {
"pretty": pretty,
"result": format!("{:?}", result),
},
}));
}
Event::Exec { timestamp, program } => {
let ts = timestamp.duration_since(epoch).unwrap().as_micros() as u64;
// FIXME: This shouldn't be an "instant" event. We should be
// able to determine the duration of the execve call.
events.push(json!({
"name": "execve",
"cat": "syscall",
"ph": "i",
"ts": ts,
"pid": self.pid,
"tid": self.tid,
"args": {
"program": program,
}
}));
}
}
}
// Record the thread/process exit.
{
let ts = self.end.duration_since(epoch).unwrap().as_micros() as u64;
events.push(json!({
"name": thread_name,
"cat": "process",
"ph": "E",
"ts": ts,
"pid": self.pid,
"tid": self.tid,
"args": {
"exit_status": self.exit_status,
}
}));
}
}
}


@ -0,0 +1,71 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use reverie::GlobalTool;
use reverie::Pid;
use serde::Deserialize;
use serde::Serialize;
use crate::event::ThreadExit;
use std::io;
use std::path::PathBuf;
use std::sync::Mutex;
use std::time::SystemTime;
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Program {
/// The path to the program.
name: PathBuf,
/// The program arguments.
args: Vec<String>,
}
#[derive(Debug)]
pub struct GlobalState {
epoch: SystemTime,
events: Mutex<Vec<ThreadExit>>,
}
impl Default for GlobalState {
fn default() -> Self {
Self {
epoch: SystemTime::now(),
events: Default::default(),
}
}
}
#[reverie::global_tool]
impl GlobalTool for GlobalState {
type Request = ThreadExit;
type Response = ();
async fn receive_rpc(&self, _pid: Pid, event: ThreadExit) {
let mut events = self.events.lock().unwrap();
events.push(event);
}
}
impl GlobalState {
/// Writes out a chrome trace file to the given writer.
pub fn chrome_trace<W: io::Write>(&self, writer: &mut W) -> serde_json::Result<()> {
let events = self.events.lock().unwrap();
let mut json: Vec<serde_json::Value> = Vec::new();
for event in events.iter() {
event.trace_event(self.epoch, &mut json);
}
let json = serde_json::Value::Array(json);
serde_json::to_writer(writer, &json)
}
}


@ -0,0 +1,61 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
//! Runs a process, recording the syscalls made by every thread in the process
//! tree, and optionally writes them out as a Chrome trace file.
mod event;
mod global_state;
mod tool;
use tool::ChromeTrace;
use structopt::StructOpt;
use anyhow::Context;
use reverie::Error;
use reverie_util::CommonToolArguments;
use std::fs;
use std::io;
use std::path::PathBuf;
/// A tool to record a Chrome trace of the process tree.
#[derive(Debug, StructOpt)]
struct Args {
#[structopt(flatten)]
common: CommonToolArguments,
/// The path to write out Chrome trace file. This can be loaded with
/// `chrome://tracing`.
#[structopt(long)]
out: Option<PathBuf>,
}
#[tokio::main]
async fn main() -> Result<(), Error> {
let args = Args::from_args();
let log_guard = args.common.init_tracing();
let tracer = reverie_ptrace::TracerBuilder::<ChromeTrace>::new(args.common.into())
.spawn()
.await?;
let (status, global_state) = tracer.wait().await?;
if let Some(path) = args.out {
let mut f = io::BufWriter::new(fs::File::create(path)?);
global_state
.chrome_trace(&mut f)
.context("failed to generate Chrome trace")?;
}
// Flush logs before exiting.
drop(log_guard);
status.raise_or_exit()
}


@ -0,0 +1,154 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::event::Event;
use crate::event::Program;
use crate::event::ThreadExit;
use crate::global_state::GlobalState;
use reverie::syscalls::SyscallInfo;
use reverie::{
syscalls::{Displayable, Syscall},
Errno, Error, ExitStatus, GlobalRPC, GlobalTool, Guest, Pid, Subscription, Tid, Tool,
};
use serde::{Deserialize, Serialize};
use std::borrow::Cow;
use std::fs;
use std::str;
use std::time::SystemTime;
#[derive(Debug, Serialize, Deserialize, Clone)]
pub struct ChromeTrace(Pid);
impl Default for ChromeTrace {
fn default() -> Self {
unreachable!("never used")
}
}
#[derive(Debug, Serialize, Deserialize, Clone)]
pub struct ThreadState {
/// Time stamp when this thread was spawned.
start: SystemTime,
/// The events that have occurred on this thread. These will be sent to the
/// global state upon thread exit.
events: Vec<Event>,
}
impl Default for ThreadState {
fn default() -> Self {
Self {
start: SystemTime::now(),
events: Vec::new(),
}
}
}
impl ThreadState {
pub fn push(&mut self, event: Event) {
self.events.push(event)
}
}
#[reverie::tool]
impl Tool for ChromeTrace {
type GlobalState = GlobalState;
type ThreadState = ThreadState;
fn new(pid: Pid, _cfg: &<Self::GlobalState as GlobalTool>::Config) -> Self {
Self(pid)
}
fn subscriptions(_cfg: &<Self::GlobalState as GlobalTool>::Config) -> Subscription {
Subscription::all_syscalls()
}
async fn handle_syscall_event<T: Guest<Self>>(
&self,
guest: &mut T,
syscall: Syscall,
) -> Result<i64, Error> {
match syscall {
Syscall::Exit(_) | Syscall::ExitGroup(_) => {
// TODO: Record exits
guest.tail_inject(syscall).await
}
Syscall::Execve(_) | Syscall::Execveat(_) => {
// TODO: Record failed execs
guest.tail_inject(syscall).await
}
_ => {
let start = SystemTime::now();
let result = guest.inject(syscall).await;
let end = SystemTime::now();
let sysno = syscall.number();
let pretty = syscall.display_with_outputs(&guest.memory()).to_string();
guest.thread_state_mut().push(Event::Syscall {
start,
end,
sysno,
pretty,
result,
});
Ok(result?)
}
}
}
async fn handle_post_exec<T: Guest<Self>>(&self, guest: &mut T) -> Result<(), Errno> {
let program = fs::read_link(format!("/proc/{}/exe", guest.pid())).unwrap();
let mut cmdline = fs::read(format!("/proc/{}/cmdline", guest.pid())).unwrap();
// Shave off the extra NUL terminator at the end so we don't end up with
// an empty arg at the end.
assert_eq!(cmdline.pop(), Some(b'\0'));
let args: Vec<_> = cmdline
.split(|byte| *byte == 0)
.map(String::from_utf8_lossy)
.map(Cow::into_owned)
.collect();
guest.thread_state_mut().push(Event::Exec {
timestamp: SystemTime::now(),
program: Program::new(program, args),
});
Ok(())
}
async fn on_exit_thread<G: GlobalRPC<Self::GlobalState>>(
&self,
tid: Tid,
global_state: &G,
thread_state: Self::ThreadState,
exit_status: ExitStatus,
) -> Result<(), Error> {
global_state
.send_rpc(ThreadExit {
pid: self.0,
tid,
start: thread_state.start,
end: SystemTime::now(),
events: thread_state.events,
exit_status,
})
.await?;
Ok(())
}
}


@ -0,0 +1,258 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use reverie::{
syscalls::{Addr, MemoryAccess, Syscall},
Error, GlobalTool, Guest, Tid, Tool,
};
use reverie_util::CommonToolArguments;
use serde::{Deserialize, Serialize};
use std::{
collections::HashMap,
fmt::Write,
io,
sync::{
atomic::{AtomicBool, Ordering},
Mutex,
},
vec::Vec,
};
use structopt::StructOpt;
use tracing::{debug, info, trace};
/// This tool will chunk together printed output from each thread, over fixed time intervals.
/// How many system calls (in each thread) define an epoch?
const EPOCH: u64 = 10;
#[derive(PartialEq, Debug, Eq, Hash, Clone, Serialize, Deserialize, Copy)]
pub enum Which {
Stderr,
Stdout,
}
/// Send individual print attempts (write calls) to the global object:
#[derive(PartialEq, Debug, Eq, Hash, Clone, Serialize, Deserialize)]
pub enum Msg {
/// Route a print over to the tracer to issue.
Print(Which, Vec<u8>),
/// Tick the logical clock.
Tick,
/// Print all buffered messages, cutting off the epoch early
Flush,
}
type LogicalTime = u64;
#[derive(Debug, Default)]
struct ChunkyPrintGlobal(Mutex<Inner>);
#[derive(Debug, Default)]
struct Inner {
times: HashMap<Tid, LogicalTime>,
printbuf: HashMap<Tid, Vec<(Which, Vec<u8>)>>,
epoch_num: u64,
}
#[reverie::global_tool]
impl GlobalTool for ChunkyPrintGlobal {
type Request = Msg;
type Response = ();
async fn receive_rpc(&self, from: Tid, m: Msg) {
let mut mg = self.0.lock().unwrap();
match m {
Msg::Print(w, s) => {
let v = mg.printbuf.entry(from).or_insert_with(Vec::new);
v.push((w, s));
}
Msg::Tick => {
let ticks = mg.times.entry(from).or_insert(0);
*ticks += 1;
mg.check_epoch();
}
Msg::Flush => {
let _ = mg.flush_messages();
}
}
}
}
impl Inner {
/// Check whether the epoch has expired and, if so, flush the buffer.
fn check_epoch(&mut self) {
if self.times.iter().all(|(_p, t)| (*t > EPOCH)) {
let _ = self.flush_messages();
self.times.iter_mut().for_each(|(_, t)| *t -= EPOCH);
self.epoch_num += 1;
}
}
fn flush_messages(&mut self) -> io::Result<()> {
let non_empty = self
.printbuf
.iter()
.fold(0, |acc, (_, v)| if v.is_empty() { acc } else { acc + 1 });
if non_empty > 1 {
let mut strbuf = String::new();
for (tid, v) in self.printbuf.iter() {
let _ = write!(&mut strbuf, "tid {}:{{", tid);
let mut iter = v.iter();
if let Some((_, b)) = iter.next() {
let _ = write!(&mut strbuf, "{}", b.len());
for (_, b) in iter {
let _ = write!(&mut strbuf, ", {}", b.len());
}
}
let _ = write!(&mut strbuf, "}} ");
}
info!(
" [chunky_print] {} threads concurrent output in epoch {}, sizes: {}",
non_empty, self.epoch_num, strbuf
);
} else {
debug!(
" [chunky_print] output from {} thread(s) in epoch {}: {} bytes",
non_empty,
self.epoch_num,
self.printbuf
.iter()
.fold(0, |acc, (_, v)| v.iter().fold(acc, |a, (_, b)| a + b.len()))
);
}
for (tid, v) in self.printbuf.iter_mut() {
for (w, b) in v.iter() {
match w {
Which::Stdout => {
trace!(
" [chunky_print] writing {} bytes to stdout from tid {}",
b.len(),
tid
);
io::Write::write_all(&mut io::stdout(), b)?;
}
Which::Stderr => {
trace!(
" [chunky_print] writing {} bytes to stderr from tid {}",
b.len(),
tid
);
io::Write::write_all(&mut io::stderr(), b)?;
}
}
}
v.clear();
}
io::Write::flush(&mut io::stdout())?;
io::Write::flush(&mut io::stderr())?;
Ok(())
}
}
#[derive(Debug, Serialize, Deserialize, Default)]
struct ChunkyPrintLocal {
stdout_disconnected: AtomicBool,
stderr_disconnected: AtomicBool,
}
impl Clone for ChunkyPrintLocal {
fn clone(&self) -> Self {
ChunkyPrintLocal {
stdout_disconnected: AtomicBool::new(self.stdout_disconnected.load(Ordering::SeqCst)),
stderr_disconnected: AtomicBool::new(self.stderr_disconnected.load(Ordering::SeqCst)),
}
}
}
fn read_tracee_memory<T: Guest<ChunkyPrintLocal>>(
guest: &T,
addr: Addr<u8>,
len: usize,
) -> Result<Vec<u8>, Error> {
let mut buf = vec![0; len];
guest.memory().read_exact(addr, &mut buf)?;
Ok(buf)
}
#[reverie::tool]
impl Tool for ChunkyPrintLocal {
type GlobalState = ChunkyPrintGlobal;
type ThreadState = ();
async fn handle_syscall_event<T: Guest<Self>>(
&self,
guest: &mut T,
call: Syscall,
) -> Result<i64, Error> {
let _ = guest.send_rpc(Msg::Tick).await;
match call {
// Here we make some attempt to catch redirections:
Syscall::Dup2(d) => {
let newfd = d.newfd();
if newfd == 1 {
self.stdout_disconnected.store(true, Ordering::SeqCst);
}
if newfd == 2 {
self.stderr_disconnected.store(true, Ordering::SeqCst);
}
guest.tail_inject(call).await
}
Syscall::Write(w) => {
match w.fd() {
1 | 2 => {
let which = if w.fd() == 1 {
if self.stdout_disconnected.load(Ordering::SeqCst) {
debug!(
" [chunky_print] letting through write on redirected stdout, {} bytes.",
w.len()
);
return guest.tail_inject(call).await;
}
Which::Stdout
} else {
if self.stderr_disconnected.load(Ordering::SeqCst) {
debug!(
" [chunky_print] letting through write on redirected stderr, {} bytes.",
w.len()
);
return guest.tail_inject(call).await;
}
Which::Stderr
};
let buf = read_tracee_memory(guest, w.buf().unwrap(), w.len())?;
let _ = guest.send_rpc(Msg::Print(which, buf)).await;
info!(
" [chunky_print] suppressed write of {} bytes to fd {}",
w.len(),
w.fd()
);
// Suppress the original system call:
Ok(w.len() as i64)
}
_ => guest.tail_inject(call).await,
}
}
_ => guest.tail_inject(call).await,
}
}
}
#[tokio::main]
async fn main() -> Result<(), Error> {
let args = CommonToolArguments::from_args();
let log_guard = args.init_tracing();
let tracer = reverie_ptrace::TracerBuilder::<ChunkyPrintLocal>::new(args.into())
.spawn()
.await?;
let (status, global_state) = tracer.wait().await?;
trace!(" [chunky_print] global exit, flushing last messages.");
let _ = global_state.0.lock().unwrap().flush_messages();
drop(log_guard); // Flush logs before exiting.
status.raise_or_exit()
}
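// The epoch logic above can be sketched without Reverie. This is a minimal,
// std-only illustration, not the tool itself: `EpochBuffer`, the plain `u32`
// thread ids, and the sorted flush order are assumptions made for the sketch
// (the tool keys on `Tid` and iterates its `HashMap` directly).

```rust
use std::collections::HashMap;

/// Number of ticks (system calls) per thread that make up one epoch.
const EPOCH: u64 = 10;

/// Hypothetical stand-in for the tool's global state.
#[derive(Default)]
struct EpochBuffer {
    times: HashMap<u32, u64>,
    printbuf: HashMap<u32, Vec<Vec<u8>>>,
    epoch_num: u64,
}

impl EpochBuffer {
    /// Buffer one write from `tid` instead of emitting it right away.
    fn print(&mut self, tid: u32, bytes: &[u8]) {
        self.printbuf.entry(tid).or_default().push(bytes.to_vec());
    }

    /// Advance `tid`'s logical clock. Returns the flushed output if this
    /// tick closed an epoch, i.e. every known thread has more than `EPOCH`
    /// ticks on its clock.
    fn tick(&mut self, tid: u32) -> Option<Vec<u8>> {
        *self.times.entry(tid).or_insert(0) += 1;
        if self.times.values().all(|t| *t > EPOCH) {
            for t in self.times.values_mut() {
                *t -= EPOCH;
            }
            self.epoch_num += 1;
            Some(self.flush())
        } else {
            None
        }
    }

    /// Drain every buffer, keeping each thread's writes contiguous.
    fn flush(&mut self) -> Vec<u8> {
        let mut tids: Vec<u32> = self.printbuf.keys().copied().collect();
        // Sorted only to make the sketch deterministic.
        tids.sort_unstable();
        let mut out = Vec::new();
        for tid in tids {
            if let Some(chunks) = self.printbuf.get_mut(&tid) {
                for chunk in chunks.drain(..) {
                    out.extend_from_slice(&chunk);
                }
            }
        }
        out
    }
}
```

// Interleaved writes from two threads come out grouped per thread once both
// threads have ticked past the epoch boundary.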


@ -0,0 +1,78 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
//! An example that counts system calls using a simple, global state.
use reverie::{
syscalls::{Syscall, SyscallInfo, Sysno},
Error, GlobalTool, Guest, Pid, Tool,
};
use reverie_util::CommonToolArguments;
use serde::{Deserialize, Serialize};
use std::sync::atomic::{AtomicU64, Ordering};
use structopt::StructOpt;
#[derive(Debug, Serialize, Deserialize, Default)]
struct CounterGlobal {
num_syscalls: AtomicU64,
}
#[derive(Debug, Serialize, Deserialize, Default, Clone)]
struct CounterLocal {}
/// The message sent to the global state method.
/// This contains the syscall number.
#[derive(PartialEq, Debug, Eq, Clone, Copy, Serialize, Deserialize)]
pub struct IncrMsg(Sysno);
#[reverie::global_tool]
impl GlobalTool for CounterGlobal {
type Request = IncrMsg;
type Response = ();
async fn init_global_state(_: &Self::Config) -> Self {
CounterGlobal {
num_syscalls: AtomicU64::new(0),
}
}
async fn receive_rpc(&self, _from: Pid, IncrMsg(sysno): IncrMsg) -> Self::Response {
AtomicU64::fetch_add(&self.num_syscalls, 1, Ordering::SeqCst);
tracing::info!("count at syscall ({:?}): {:?}", sysno, self.num_syscalls);
}
}
#[reverie::tool]
impl Tool for CounterLocal {
type GlobalState = CounterGlobal;
async fn handle_syscall_event<T: Guest<Self>>(
&self,
guest: &mut T,
syscall: Syscall,
) -> Result<i64, Error> {
let sysno = syscall.number();
let _ = guest.send_rpc(IncrMsg(sysno)).await?;
guest.tail_inject(syscall).await
}
}
#[tokio::main]
async fn main() -> Result<(), Error> {
let args = CommonToolArguments::from_args();
let log_guard = args.init_tracing();
let tracer = reverie_ptrace::TracerBuilder::<CounterLocal>::new(args.into())
.spawn()
.await?;
let (status, global_state) = tracer.wait().await?;
eprintln!(
" [counter tool] Total system calls in process tree: {}",
AtomicU64::load(&global_state.num_syscalls, Ordering::SeqCst)
);
drop(log_guard); // Flush logs before exiting.
status.raise_or_exit()
}


@ -0,0 +1,157 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
//! An example that counts system calls using per-thread, per-process, and
//! global state.
use reverie::{
syscalls::{Syscall, SyscallInfo},
Error, ExitStatus, GlobalRPC, GlobalTool, Guest, Pid, Tid, Tool,
};
use reverie_util::CommonToolArguments;
use structopt::StructOpt;
use core::sync::atomic::{AtomicU64, Ordering};
use serde::{Deserialize, Serialize};
use std::sync::Mutex;
use tracing::debug;
/// Global state for the tool.
#[derive(Debug, Serialize, Deserialize, Default)]
pub struct GlobalInner {
pub total_syscalls: u64,
pub exited_procs: u64,
pub exited_threads: u64,
}
#[derive(Debug, Serialize, Deserialize, Default)]
pub struct CounterGlobal {
pub inner: Mutex<GlobalInner>,
}
/// Local, per-process state for the tool.
#[derive(Debug, Serialize, Deserialize, Default)]
pub struct CounterLocal {
proc_syscalls: AtomicU64,
exited_threads: AtomicU64,
}
impl Clone for CounterLocal {
fn clone(&self) -> Self {
CounterLocal {
proc_syscalls: AtomicU64::new(self.proc_syscalls.load(Ordering::SeqCst)),
exited_threads: AtomicU64::new(self.exited_threads.load(Ordering::SeqCst)),
}
}
}
/// The message sent to the global state method.
#[derive(PartialEq, Debug, Eq, Hash, Clone, Serialize, Deserialize, Copy)]
pub struct IncrMsg(u64, u64);
#[reverie::global_tool]
impl GlobalTool for CounterGlobal {
type Request = IncrMsg;
type Response = ();
async fn init_global_state(_: &Self::Config) -> Self {
CounterGlobal {
inner: Mutex::new(GlobalInner {
total_syscalls: 0,
exited_procs: 0,
exited_threads: 0,
}),
}
}
async fn receive_rpc(&self, _from: Pid, IncrMsg(n, t): IncrMsg) -> Self::Response {
let mut mg = self.inner.lock().unwrap();
mg.total_syscalls += n;
mg.exited_threads += t;
mg.exited_procs += 1;
}
}
#[reverie::tool]
impl Tool for CounterLocal {
type GlobalState = CounterGlobal;
/// Yet another level of counters per-thread:
type ThreadState = u64;
fn new(pid: Pid, _cfg: &()) -> Self {
debug!(" [counter] initialize counter for pid {}", pid);
CounterLocal {
proc_syscalls: AtomicU64::new(0),
exited_threads: AtomicU64::new(0),
}
}
async fn handle_syscall_event<T: Guest<Self>>(
&self,
guest: &mut T,
call: Syscall,
) -> Result<i64, Error> {
*guest.thread_state_mut() += 1;
debug!(
"thread count at syscall ({:?}): {}, process count: {}",
call.number(),
guest.thread_state(),
self.proc_syscalls.load(Ordering::SeqCst)
);
guest.tail_inject(call).await
}
async fn on_exit_thread<G: GlobalRPC<Self::GlobalState>>(
&self,
tid: Tid,
_global_state: &G,
ts: u64,
_exit_status: ExitStatus,
) -> Result<(), Error> {
debug!("count at exit thread {} = {}", tid, &ts);
self.proc_syscalls.fetch_add(ts, Ordering::SeqCst);
self.exited_threads.fetch_add(1, Ordering::SeqCst);
debug!(
" contributed to process-level count: {}",
self.proc_syscalls.load(Ordering::Relaxed)
);
Ok(())
}
async fn on_exit_process<G: GlobalRPC<Self::GlobalState>>(
self,
pid: Pid,
global_state: &G,
_exit_status: ExitStatus,
) -> Result<(), Error> {
let count = self.proc_syscalls.load(Ordering::SeqCst);
let threads = self.exited_threads.load(Ordering::SeqCst);
drop(self);
debug!(
"At ExitProc (pid {}), contributing {} to global count.",
pid, count
);
let _ = global_state.send_rpc(IncrMsg(count, threads)).await?;
Ok(())
}
}
#[tokio::main]
async fn main() -> Result<(), Error> {
let args = CommonToolArguments::from_args();
let log_guard = args.init_tracing();
let tracer = reverie_ptrace::TracerBuilder::<CounterLocal>::new(args.into())
.spawn()
.await?;
let (status, global_state) = tracer.wait().await?;
let mg = global_state.inner.lock().unwrap();
eprintln!(
" [counter tool] Total system calls in process tree: {}, from {} processes, {} thread(s).",
mg.total_syscalls, mg.exited_procs, mg.exited_threads
);
drop(log_guard); // Flush logs before exiting.
status.raise_or_exit()
}
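// The three levels of counting above (thread, process, global) can be
// sketched with the standard library alone. This is an illustration under
// assumptions: `ProcessCounters` and `GlobalTotals` are hypothetical names,
// and a direct method call stands in for the RPC to the global state.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// Totals kept by the single global instance.
#[derive(Debug, Default, PartialEq)]
struct GlobalTotals {
    total_syscalls: u64,
    exited_procs: u64,
    exited_threads: u64,
}

/// Per-process counters, shared by that process's threads.
#[derive(Default)]
struct ProcessCounters {
    proc_syscalls: AtomicU64,
    exited_threads: AtomicU64,
}

impl ProcessCounters {
    /// A thread exits: fold its private count into the process total.
    fn on_exit_thread(&self, thread_count: u64) {
        self.proc_syscalls.fetch_add(thread_count, Ordering::SeqCst);
        self.exited_threads.fetch_add(1, Ordering::SeqCst);
    }

    /// The process exits: report once to the global totals.
    fn on_exit_process(self, global: &Mutex<GlobalTotals>) {
        let mut g = global.lock().unwrap();
        g.total_syscalls += self.proc_syscalls.load(Ordering::SeqCst);
        g.exited_threads += self.exited_threads.load(Ordering::SeqCst);
        g.exited_procs += 1;
    }
}
```

// Each level only talks to the level above it when it exits, so the global
// lock is taken once per process rather than once per system call.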

50
reverie-examples/debug.rs Normal file

@ -0,0 +1,50 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
//! This instrumentation tool intercepts events but does nothing with them,
//! except acting as a gdbserver.
use reverie::{Error, Subscription, Tool};
use reverie_util::CommonToolArguments;
use serde::{Deserialize, Serialize};
use structopt::StructOpt;
#[derive(Debug, Default, Serialize, Deserialize)]
struct DebugTool;
impl Tool for DebugTool {
fn subscriptions(_cfg: &()) -> Subscription {
Subscription::none()
}
}
/// A tool that runs a program under Reverie while acting as a gdbserver, so
/// that a debugger can attach on the given port.
#[derive(Debug, StructOpt)]
struct Args {
#[structopt(flatten)]
common_opts: CommonToolArguments,
#[structopt(long, default_value = "1234", help = "launch gdbserver on given port")]
port: u16,
}
#[tokio::main]
async fn main() -> Result<(), Error> {
let args = Args::from_args();
let port = args.port;
let log_guard = args.common_opts.init_tracing();
eprintln!("Listening on port {}", port);
let tracer = reverie_ptrace::TracerBuilder::<DebugTool>::new(args.common_opts.into())
.gdbserver(port)
.spawn()
.await?;
let (status, _global_state) = tracer.wait().await?;
drop(log_guard); // Flush logs before exiting.
status.raise_or_exit()
}

38
reverie-examples/noop.rs Normal file

@ -0,0 +1,38 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
//! This instrumentation tool intercepts events but does nothing with them. It is
//! useful for observing the overhead of interception, and as a starting point.
use reverie::{Error, Subscription, Tool};
use reverie_util::CommonToolArguments;
use serde::{Deserialize, Serialize};
use structopt::StructOpt;
#[derive(Debug, Default, Serialize, Deserialize)]
struct NoopTool;
#[reverie::tool]
impl Tool for NoopTool {
fn subscriptions(_cfg: &()) -> Subscription {
Subscription::none()
}
}
#[tokio::main]
async fn main() -> Result<(), Error> {
let args = CommonToolArguments::from_args();
let log_guard = args.init_tracing();
let tracer = reverie_ptrace::TracerBuilder::<NoopTool>::new(args.into())
.spawn()
.await?;
let (status, _global_state) = tracer.wait().await?;
drop(log_guard); // Flush logs before exiting.
status.raise_or_exit()
}


@ -0,0 +1,90 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
//! An example that tracks thread pedigree using local state
use reverie::{syscalls::Syscall, Error, Guest, Pid, Tool};
use reverie_util::{pedigree::Pedigree, CommonToolArguments};
use serde::{Deserialize, Serialize};
use structopt::StructOpt;
use tracing::{debug, trace};
// TODO: Handle pedigree forking, initialization, etc. in this tool.
// This tool is NOT FUNCTIONAL in its current state.
#[derive(Debug, Serialize, Deserialize, Default, Clone)]
struct PedigreeLocal(Pedigree);
#[reverie::tool]
impl Tool for PedigreeLocal {
type ThreadState = PedigreeLocal;
fn new(pid: Pid, _cfg: &()) -> Self {
debug!("[pedigree] initialize pedigree for pid {}", pid);
PedigreeLocal(Pedigree::new())
}
fn init_thread_state(
&self,
_tid: Pid,
parent: Option<(Pid, &Self::ThreadState)>,
) -> Self::ThreadState {
if let Some((_, state)) = parent {
let mut parent = state.clone();
let child = parent.0.fork_mut();
trace!("child pedigree: {:?}", child);
PedigreeLocal(child)
} else {
PedigreeLocal(Pedigree::new())
}
}
async fn handle_syscall_event<T: Guest<Self>>(
&self,
guest: &mut T,
syscall: Syscall,
) -> Result<i64, Error> {
match syscall {
Syscall::Fork(_) | Syscall::Vfork(_) | Syscall::Clone(_) => {
let retval = guest.inject(syscall).await?;
let pedigree = guest.thread_state_mut().0.fork_mut();
trace!(
"got new pedigree: {:?} => {:x?}",
pedigree,
nix::unistd::Pid::try_from(&pedigree)
);
Ok(retval)
}
Syscall::Getpid(_)
| Syscall::Getppid(_)
| Syscall::Gettid(_)
| Syscall::Getpgid(_)
| Syscall::Getpgrp(_) => {
let pid = guest.inject(syscall).await?;
let vpid = nix::unistd::Pid::try_from(&self.0).unwrap();
trace!("getpid returned {:?} vpid: {:?}", pid, vpid);
Ok(pid)
}
Syscall::Setpgid(_) => {
panic!("[pedigree] setpgid is not allowed.");
}
_ => guest.tail_inject(syscall).await,
}
}
}
#[tokio::main]
async fn main() -> Result<(), Error> {
let args = CommonToolArguments::from_args();
let log_guard = args.init_tracing();
let tracer = reverie_ptrace::TracerBuilder::<PedigreeLocal>::new(args.into())
.spawn()
.await?;
let (status, _global_state) = tracer.wait().await?;
drop(log_guard); // Flush logs before exiting.
status.raise_or_exit()
}


@ -0,0 +1,17 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::filter::Filter;
use serde::{Deserialize, Serialize};
#[derive(Clone, Default, Serialize, Deserialize)]
pub struct Config {
pub filters: Vec<Filter>,
}


@ -0,0 +1,79 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use reverie::syscalls::Sysno;
use serde::{Deserialize, Serialize};
#[derive(Debug, Clone, Deserialize, Serialize, Eq, PartialEq)]
pub struct Filter {
/// Inverts the match.
pub inverse: bool,
/// The set of syscalls to match.
pub syscalls: Vec<Sysno>,
}
impl std::str::FromStr for Filter {
type Err = String;
// Must parse this: [!][?]value1[,[?]value2]...
fn from_str(s: &str) -> Result<Self, Self::Err> {
let (inverse, s) = match s.strip_prefix('!') {
Some(s) => (true, s),
None => (false, s),
};
let mut syscalls = Vec::new();
for value in s.split(',') {
// FIXME: Handle syscall sets, so we can use `%stat` to trace all
// stat calls, for example.
if value.strip_prefix('%').is_some() {
return Err("filtering sets of syscalls is not yet supported".into());
}
let syscall: Sysno = value
.parse()
.map_err(|()| format!("invalid syscall name '{}'", value))?;
syscalls.push(syscall);
}
Ok(Self { inverse, syscalls })
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn parse_filter() {
assert_eq!(
"open,mmap".parse(),
Ok(Filter {
inverse: false,
syscalls: vec![Sysno::open, Sysno::mmap]
})
);
assert_eq!(
"open,foobar".parse::<Filter>(),
Err("invalid syscall name 'foobar'".into())
);
assert_eq!(
"!read,write".parse(),
Ok(Filter {
inverse: true,
syscalls: vec![Sysno::read, Sysno::write]
})
);
}
}
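// The `[!]value1[,value2]...` grammar above can be tried out without the
// `reverie` crate. In this sketch the `KNOWN` table and `NameFilter` are
// hypothetical stand-ins for `Sysno` and `Filter`; only the parsing shape
// mirrors the code above.

```rust
/// Hypothetical stand-in for `Sysno`: a tiny table of known syscall names.
const KNOWN: &[&str] = &["open", "read", "write", "mmap"];

#[derive(Debug, PartialEq, Eq)]
struct NameFilter {
    /// True when the filter started with `!`.
    inverse: bool,
    /// The syscall names listed in the filter, in order.
    syscalls: Vec<&'static str>,
}

fn parse_filter(s: &str) -> Result<NameFilter, String> {
    // A leading '!' inverts the whole filter.
    let (inverse, rest) = match s.strip_prefix('!') {
        Some(rest) => (true, rest),
        None => (false, s),
    };

    let mut syscalls = Vec::new();
    for value in rest.split(',') {
        // Sets such as `%stat` are rejected, as in the original.
        if value.starts_with('%') {
            return Err("syscall sets are not supported in this sketch".to_string());
        }
        let name = KNOWN
            .iter()
            .copied()
            .find(|known| *known == value)
            .ok_or_else(|| format!("invalid syscall name '{}'", value))?;
        syscalls.push(name);
    }

    Ok(NameFilter { inverse, syscalls })
}
```

// Note that the `!` applies to the whole comma-separated list, not to the
// first name only.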


@ -0,0 +1,23 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use reverie::{GlobalTool, Pid};
use crate::config::Config;
#[derive(Debug, Default)]
pub struct GlobalState;
#[reverie::global_tool]
impl GlobalTool for GlobalState {
type Request = ();
type Response = ();
type Config = Config;
async fn receive_rpc(&self, _pid: Pid, _req: Self::Request) {}
}


@ -0,0 +1,57 @@
/*
* Copyright (c) 2018-2019, Trustees of Indiana University
* ("University Works" via Baojun Wang)
* Copyright (c) 2018-2019, Ryan Newton
* ("Traditional Works of Scholarship")
* Copyright (c) 2020-, Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
mod config;
mod filter;
mod global_state;
mod tool;
use config::Config;
use filter::Filter;
use tool::Strace;
use structopt::StructOpt;
use reverie::Error;
use reverie_util::CommonToolArguments;
/// A tool to trace system calls.
#[derive(StructOpt, Debug)]
struct Opts {
#[structopt(flatten)]
common: CommonToolArguments,
/// The set of syscalls to trace. By default, all syscalls are traced. If
/// this is used, then only the specified syscalls are traced. By limiting
/// the set of traced syscalls, we can reduce the overhead of the tracer.
#[structopt(long)]
trace: Vec<Filter>,
}
#[tokio::main]
async fn main() -> Result<(), Error> {
let args = Opts::from_args();
let config = Config {
filters: args.trace,
};
let log_guard = args.common.init_tracing();
let tracer = reverie_ptrace::TracerBuilder::<Strace>::new(args.common.into())
.config(config)
.spawn()
.await?;
let (status, _) = tracer.wait().await?;
drop(log_guard); // Flush logs before exiting.
status.raise_or_exit()
}


@ -0,0 +1,115 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::config::Config;
use crate::global_state::GlobalState;
use reverie::syscalls::{Displayable, Errno, Syscall, SyscallInfo};
use reverie::{Error, Guest, Signal, Subscription, Tool};
use serde::{Deserialize, Serialize};
// Strace has no need for process-level state, so this is a unit struct.
#[derive(Debug, Serialize, Deserialize, Default, Clone)]
pub struct Strace;
/// The tool keeps no per-process state. Its global state is the separate,
/// stateless `GlobalState` type, which carries the `Config`.
#[reverie::tool]
impl Tool for Strace {
type GlobalState = GlobalState;
fn subscriptions(cfg: &Config) -> Subscription {
// Check if we're only excluding things.
let exclude_only = cfg.filters.iter().all(|f| f.inverse);
let mut subs = if exclude_only {
// Only excluding syscalls.
Subscription::all_syscalls()
} else {
// Only including syscalls.
Subscription::none()
};
for filter in &cfg.filters {
let syscalls = filter.syscalls.iter().copied();
if filter.inverse {
subs.disable_syscalls(syscalls);
} else {
subs.syscalls(syscalls);
}
}
subs
}
async fn handle_syscall_event<T: Guest<Self>>(
&self,
guest: &mut T,
syscall: Syscall,
) -> Result<i64, Error> {
match syscall {
Syscall::Exit(_) | Syscall::ExitGroup(_) => {
eprintln!(
"[pid {}] {} = ?",
guest.tid().colored(),
syscall.display_with_outputs(&guest.memory()),
);
guest.tail_inject(syscall).await
}
Syscall::Execve(_) | Syscall::Execveat(_) => {
let tid = guest.tid();
// Must be formatted before injecting the syscall: if execve/execveat
// succeeds, the original program image is wiped out and the memory
// references become invalid.
eprintln!(
"[pid {}] {}",
tid.colored(),
syscall.display_with_outputs(&guest.memory())
);
let errno = guest.inject(syscall).await.unwrap_err();
eprintln!(
"[pid {}] ({}) = {:?}",
tid.colored(),
syscall.number(),
errno
);
Err(errno.into())
}
_otherwise => {
let syscall_ret = guest.inject(syscall).await;
eprintln!(
"[pid {}] {} = {}",
guest.tid().colored(),
syscall.display_with_outputs(&guest.memory()),
// TODO: Pretty print the return value according to its type.
syscall_ret.unwrap_or_else(|errno| -errno.into_raw() as i64)
);
Ok(syscall_ret?)
}
}
}
async fn handle_signal_event<G: Guest<Self>>(
&self,
guest: &mut G,
signal: Signal,
) -> Result<Option<Signal>, Errno> {
eprintln!(
"[pid {}] Received signal: {}",
guest.tid().colored(),
signal
);
Ok(Some(signal))
}
}
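// The include/exclude rule in `subscriptions` above can be sketched with a
// plain set. This is an illustration under assumptions: `ALL`,
// `SketchFilter`, and `subscribed` are hypothetical, and a `BTreeSet` of
// names stands in for `Subscription`.

```rust
use std::collections::BTreeSet;

/// Hypothetical universe of syscalls for the sketch.
const ALL: &[&str] = &["open", "read", "write", "mmap"];

struct SketchFilter {
    inverse: bool,
    syscalls: Vec<&'static str>,
}

fn subscribed(filters: &[SketchFilter]) -> BTreeSet<&'static str> {
    // If every filter is an exclusion (or there are no filters at all),
    // start from everything; otherwise start from nothing.
    let exclude_only = filters.iter().all(|f| f.inverse);
    let mut subs: BTreeSet<&'static str> = if exclude_only {
        ALL.iter().copied().collect()
    } else {
        BTreeSet::new()
    };

    // Apply the filters in order: inverted filters remove, others add.
    for f in filters {
        for name in &f.syscalls {
            if f.inverse {
                subs.remove(name);
            } else {
                subs.insert(*name);
            }
        }
    }
    subs
}
```

// With no filters the `all` check is vacuously true, which gives the
// "trace everything by default" behavior.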


@ -0,0 +1,46 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use reverie::{
syscalls::{Displayable, Syscall},
Error, Guest, Tool,
};
use reverie_util::CommonToolArguments;
use serde::{Deserialize, Serialize};
use structopt::StructOpt;
#[derive(Serialize, Deserialize, Default)]
struct StraceTool {}
#[reverie::tool]
impl Tool for StraceTool {
async fn handle_syscall_event<T: Guest<Self>>(
&self,
guest: &mut T,
syscall: Syscall,
) -> Result<i64, Error> {
eprintln!(
"[pid {}] {} = ?",
guest.tid(),
syscall.display_with_outputs(&guest.memory()),
);
guest.tail_inject(syscall).await
}
}
#[tokio::main]
async fn main() -> Result<(), Error> {
let args = CommonToolArguments::from_args();
let log_guard = args.init_tracing();
let tracer = reverie_ptrace::TracerBuilder::<StraceTool>::new(args.into())
.spawn()
.await?;
let (status, _) = tracer.wait().await?;
drop(log_guard); // Flush logs before exiting.
status.raise_or_exit()
}


@ -0,0 +1,26 @@
# @generated by autocargo
[package]
name = "reverie-process"
version = "0.1.0"
authors = ["Facebook"]
edition = "2021"
license = "BSD-2-Clause"
[dependencies]
bincode = "1.3.3"
bitflags = "1.3"
colored = "1.9"
futures = { version = "0.3.13", features = ["async-await", "compat"] }
libc = "0.2.98"
nix = "0.22"
serde = { version = "1.0.126", features = ["derive", "rc"] }
syscalls = { version = "0.4.2", features = ["with-serde"] }
thiserror = "1.0.29"
tokio = { version = "1.10", features = ["full", "test-util", "tracing"] }
[dev-dependencies]
const-cstr = "0.3.0"
num_cpus = "1.11"
raw-cpuid = "9.0"
tempfile = "3.2"


@ -0,0 +1,760 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use std::borrow::Cow;
use std::collections::BTreeMap;
use std::ffi::{OsStr, OsString};
use std::io;
use std::os::unix::ffi::OsStrExt;
use std::os::unix::fs::PermissionsExt;
use std::path::{Path, PathBuf};
use syscalls::Errno;
use super::seccomp;
use super::util::to_cstring;
use super::util::CStringArray;
use super::Command;
use super::Container;
use super::Mount;
use super::Namespace;
use super::PtyChild;
use super::Stdio;
impl Command {
/// Constructs a new `Command` for launching the program at path `program`,
/// with the following default configuration:
///
/// * No arguments to the program
/// * Inherit the current process's environment
/// * Inherit the current process's working directory
/// * Inherit stdin/stdout/stderr for `spawn` or `status`, but create pipes
/// for `output`
///
/// Builder methods are provided to change these defaults and
/// otherwise configure the process.
///
/// If `program` is not an absolute path, the `PATH` will be searched in an
/// OS-defined way.
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::Command;
/// let command = Command::new("sh");
/// ```
pub fn new<S: AsRef<OsStr>>(program: S) -> Self {
let program = to_cstring(program);
let mut args = CStringArray::with_capacity(1);
args.push(program.clone());
Self {
program,
args,
pre_exec: Vec::new(),
container: Container::new(),
}
}
/// Sets the path to the program. This can be used to override what was
/// already set in [`Command::new`].
///
/// NOTE: This also changes argument 0 to match `program`.
pub fn program<S: AsRef<OsStr>>(&mut self, program: S) -> &mut Self {
let cstring = to_cstring(program);
self.program = cstring.clone();
self.args.set(0, cstring);
self
}
/// Explicitly sets the first argument. By default, this is the same as the
/// program path and is what you want in most cases.
pub fn arg0<S: AsRef<OsStr>>(&mut self, arg0: S) -> &mut Self {
self.args.set(0, to_cstring(arg0));
self
}
/// Gets the first argument. Unless [`Command::arg0`] was used, this returns
/// the same string as [`Command::get_program`].
pub fn get_arg0(&self) -> &OsStr {
OsStr::from_bytes(self.args.get(0).to_bytes())
}
/// Adds an argument to pass to the program.
///
/// Only one argument can be passed per use. So instead of:
///
/// ```no_run
/// reverie_process::Command::new("sh")
/// .arg("-C /path/to/repo");
/// ```
///
/// usage would be:
///
/// ```no_run
/// reverie_process::Command::new("sh")
/// .arg("-C")
/// .arg("/path/to/repo");
/// ```
///
/// To pass multiple arguments see [`args`].
///
/// [`args`]: method@Self::args
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::Command;
///
/// let command = Command::new("ls")
/// .arg("-l")
/// .arg("-a");
/// ```
pub fn arg<S: AsRef<OsStr>>(&mut self, arg: S) -> &mut Self {
self.args.push(to_cstring(arg));
self
}
/// Adds multiple arguments to pass to the program.
///
/// To pass a single argument see [`arg`].
///
/// [`arg`]: method@Self::arg
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::Command;
///
/// let command = Command::new("ls")
/// .args(&["-l", "-a"]);
/// ```
pub fn args<I, S>(&mut self, args: I) -> &mut Command
where
I: IntoIterator<Item = S>,
S: AsRef<OsStr>,
{
for arg in args {
self.arg(arg);
}
self
}
/// Returns an iterator of the arguments that will be passed to the program.
///
/// This does not include the program name itself. It only includes the
/// arguments specified with [`Command::arg`] and [`Command::args`].
pub fn get_args(&self) -> impl Iterator<Item = &OsStr> {
self.args
.iter()
.skip(1)
.map(|arg| OsStr::from_bytes(arg.to_bytes()))
}
/// Inserts or updates an environment variable mapping.
///
/// Note that environment variable names are case-insensitive (but
/// case-preserving) on Windows, and case-sensitive on all other platforms.
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::Command;
///
/// let command = Command::new("ls")
/// .env("PATH", "/bin");
/// ```
pub fn env<K, V>(&mut self, key: K, val: V) -> &mut Self
where
K: AsRef<OsStr>,
V: AsRef<OsStr>,
{
self.container.env(key, val);
self
}
/// Adds or updates multiple environment variable mappings.
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::{Command, Stdio};
/// use std::env;
/// use std::collections::HashMap;
///
/// let filtered_env : HashMap<String, String> =
/// env::vars().filter(|&(ref k, _)|
/// k == "TERM" || k == "TZ" || k == "LANG" || k == "PATH"
/// ).collect();
///
/// let command = Command::new("printenv")
/// .stdin(Stdio::null())
/// .stdout(Stdio::inherit())
/// .env_clear()
/// .envs(&filtered_env);
/// ```
pub fn envs<I, K, V>(&mut self, vars: I) -> &mut Self
where
I: IntoIterator<Item = (K, V)>,
K: AsRef<OsStr>,
V: AsRef<OsStr>,
{
self.container.envs(vars);
self
}
/// Removes an environment variable mapping.
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::Command;
///
/// let command = Command::new("ls")
/// .env_remove("PATH");
/// ```
pub fn env_remove<K: AsRef<OsStr>>(&mut self, key: K) -> &mut Self {
self.container.env_remove(key);
self
}
/// Clears the entire environment map for the child process.
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::Command;
///
/// let command = Command::new("ls")
/// .env_clear();
/// ```
pub fn env_clear(&mut self) -> &mut Self {
self.container.env_clear();
self
}
/// Sets the working directory for the child process.
///
/// # Interaction with `chroot`
///
/// The working directory is set *after* the chroot is performed (if a chroot
/// directory is specified). Thus, the path given is relative to the chroot
/// directory. Otherwise, if no chroot directory is specified, the working
/// directory is relative to the current working directory of the parent
/// process at the time the child process is spawned.
///
/// # Platform-specific behavior
///
/// If the program path is relative (e.g., `"./script.sh"`), it's ambiguous
/// whether it should be interpreted relative to the parent's working
/// directory or relative to `current_dir`. The behavior in this case is
/// platform specific and unstable, and it's recommended to use
/// [`canonicalize`] to get an absolute program path instead.
///
/// [`canonicalize`]: std::fs::canonicalize()
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::Command;
///
/// let command = Command::new("ls")
/// .current_dir("/bin");
/// ```
pub fn current_dir<P: AsRef<Path>>(&mut self, dir: P) -> &mut Self {
self.container.current_dir(dir);
self
}
/// Sets configuration for the child process's standard input (stdin) handle.
///
/// Defaults to [`Stdio::inherit`] when used with `spawn` or `status`, and
/// defaults to [`Stdio::piped`] when used with `output`.
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::{Command, Stdio};
///
/// let command = Command::new("ls")
/// .stdin(Stdio::null());
/// ```
pub fn stdin<T: Into<Stdio>>(&mut self, cfg: T) -> &mut Self {
self.container.stdin(cfg);
self
}
/// Sets configuration for the child process's standard output (stdout)
/// handle.
///
/// Defaults to [`Stdio::inherit`] when used with `spawn` or `status`, and
/// defaults to [`Stdio::piped`] when used with `output`.
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::{Command, Stdio};
///
/// let command = Command::new("ls")
/// .stdout(Stdio::null());
/// ```
pub fn stdout<T: Into<Stdio>>(&mut self, cfg: T) -> &mut Self {
self.container.stdout(cfg);
self
}
/// Sets configuration for the child process's standard error (stderr)
/// handle.
///
/// Defaults to [`Stdio::inherit`] when used with `spawn` or `status`, and
/// defaults to [`Stdio::piped`] when used with `output`.
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::{Command, Stdio};
///
/// let command = Command::new("ls")
/// .stderr(Stdio::null());
/// ```
pub fn stderr<T: Into<Stdio>>(&mut self, cfg: T) -> &mut Self {
self.container.stderr(cfg);
self
}
/// Changes the root directory of the child process to the specified path.
/// This root directory is inherited by all descendants of the child
/// process.
///
/// Note that changing the root directory may cause the program to not be
/// found. As such, the program path should be relative to this directory.
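///
/// # Examples
///
/// A minimal sketch, not a definitive recipe: the `/srv/jail` directory is
/// hypothetical and is assumed to contain `bin/ls`. Calling `chroot(2)`
/// normally requires `CAP_SYS_CHROOT`.
///
/// ```no_run
/// use reverie_process::Command;
///
/// // Per the note above, the program path is given as seen from inside the
/// // new root, so this is intended to run `/srv/jail/bin/ls` (hypothetical).
/// let mut command = Command::new("/bin/ls");
/// command.chroot("/srv/jail");
/// ```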
pub fn chroot<P: AsRef<Path>>(&mut self, chroot: P) -> &mut Self {
self.container.chroot(chroot);
self
}
/// Unshares parts of the process execution context that are normally shared
/// with the parent process. This is useful for executing the child process
/// in a new namespace.
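///
/// # Examples
///
/// A minimal sketch, assuming `Namespace` is exported from the crate root
/// (the import path is an assumption). Flags from repeated calls accumulate.
///
/// ```no_run
/// use reverie_process::{Command, Namespace};
///
/// let mut command = Command::new("ls");
/// // Request new user and UTS namespaces for the child.
/// command
///     .unshare(Namespace::USER)
///     .unshare(Namespace::UTS);
/// ```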
pub fn unshare(&mut self, namespace: Namespace) -> &mut Self {
self.container.unshare(namespace);
self
}
/// Schedules a closure to be run just before the `exec` function is invoked.
///
/// The closure is allowed to return an I/O error whose OS error code will be
/// communicated back to the parent and returned as an error from when the
/// spawn was requested.
///
/// Multiple closures can be registered and they will be called in order of
/// their registration. If a closure returns `Err` then no further closures
/// will be called and the spawn operation will immediately return with a
/// failure.
///
/// # Safety
///
/// This closure will be run in the context of the child process after a
/// `fork`. This primarily means that any modifications made to memory on
/// behalf of this closure will **not** be visible to the parent process.
/// This is often a very constrained environment where normal operations like
/// `malloc` or acquiring a mutex are not guaranteed to work (due to other
/// threads perhaps still running when the `fork` was run).
///
/// This also means that all resources such as file descriptors and
/// memory-mapped regions got duplicated. It is your responsibility to make
/// sure that the closure does not violate library invariants by making
/// invalid use of these duplicates.
///
/// When this closure is run, aspects such as the stdio file descriptors and
/// working directory have successfully been changed, so output to these
/// locations may not appear where intended.
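///
/// # Examples
///
/// A minimal sketch of registering a callback. The closure body is a
/// placeholder that does nothing; a real callback should restrict itself to
/// operations that are safe to perform after a `fork`.
///
/// ```no_run
/// use reverie_process::Command;
///
/// let mut command = Command::new("ls");
/// // SAFETY: the closure does not allocate, take locks, or touch state
/// // shared with the parent.
/// unsafe {
///     command.pre_exec(|| Ok(()));
/// }
/// ```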
pub unsafe fn pre_exec<F>(&mut self, f: F) -> &mut Self
where
F: FnMut() -> Result<(), Errno> + Send + Sync + 'static,
{
self.pre_exec.push(Box::new(f));
self
}
/// Returns the path to the program that was given to [`Command::new`].
///
/// # Examples
///
/// ```
/// use reverie_process::Command;
///
/// let cmd = Command::new("echo");
/// assert_eq!(cmd.get_program(), "echo");
/// ```
pub fn get_program(&self) -> &OsStr {
OsStr::from_bytes(self.program.to_bytes())
}
/// Returns the working directory for the child process.
///
/// This returns `None` if the working directory will not be changed.
pub fn get_current_dir(&self) -> Option<&Path> {
self.container.get_current_dir()
}
/// Returns an iterator of the environment variables that will be set when
/// the process is spawned. Note that this does not include any environment
/// variables inherited from the parent process.
pub fn get_envs(&self) -> impl Iterator<Item = (&OsStr, Option<&OsStr>)> {
self.container.get_envs()
}
/// Returns a mapping of all environment variables that the new child process
/// will inherit.
pub fn get_captured_envs(&self) -> BTreeMap<OsString, OsString> {
self.container.get_captured_envs()
}
/// Gets an environment variable. If the child process is to inherit this
/// environment variable from the current process, then this returns the
/// current process's environment variable unless it is to be overridden.
pub fn get_env<K: AsRef<OsStr>>(&self, env: K) -> Option<Cow<OsStr>> {
self.container.get_env(env)
}
/// Maps one user ID to another.
///
/// Implies `Namespace::USER`.
///
/// # Example
///
/// This can be used to gain `CAP_SYS_ADMIN` privileges in the user
/// namespace by mapping the root user inside the container to the current
/// user outside of the container.
///
/// ```no_run
/// use reverie_process::Command;
///
/// let command = Command::new("ls")
/// .map_uid(0, unsafe { libc::getuid() });
/// ```
///
/// # Implementation
///
/// This modifies `/proc/{pid}/uid_map` where `{pid}` is the PID of the child
/// process. See [`user_namespaces(7)`] for more details.
///
/// [`user_namespaces(7)`]: https://man7.org/linux/man-pages/man7/user_namespaces.7.html
pub fn map_uid(&mut self, inside_uid: libc::uid_t, outside_uid: libc::uid_t) -> &mut Self {
self.container.map_uid(inside_uid, outside_uid);
self
}
/// Maps potentially many user IDs inside the new user namespace to user IDs
/// outside of the user namespace.
///
/// Implies `Namespace::USER`.
///
/// # Implementation
///
/// This modifies `/proc/{pid}/uid_map` where `{pid}` is the PID of the child
/// process. See [`user_namespaces(7)`] for more details.
///
/// [`user_namespaces(7)`]: https://man7.org/linux/man-pages/man7/user_namespaces.7.html
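///
/// # Example
///
/// A minimal sketch with illustrative ID values. It maps the 1000 user IDs
/// starting at 0 inside the namespace to the 1000 user IDs starting at
/// 100000 outside of it. Writing a multi-ID mapping generally requires extra
/// privileges in the parent namespace.
///
/// ```no_run
/// use reverie_process::Command;
///
/// let mut command = Command::new("ls");
/// // Arguments: first inside ID, first outside ID, number of IDs mapped.
/// command.map_uid_range(0, 100000, 1000);
/// ```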
pub fn map_uid_range(
&mut self,
starting_inside_uid: libc::uid_t,
starting_outside_uid: libc::uid_t,
count: u32,
) -> &mut Self {
self.container
.map_uid_range(starting_inside_uid, starting_outside_uid, count);
self
}
/// Convenience function for mapping root (inside the container) to the current
/// user ID (outside the container). This is useful for gaining new
/// capabilities inside the container, such as being able to mount file
/// systems.
///
/// Implies `Namespace::USER`.
///
/// This is the same as:
/// ```no_run
/// use reverie_process::Command;
///
/// let command = Command::new("ls")
/// .map_uid(0, unsafe { libc::geteuid() })
/// .map_gid(0, unsafe { libc::getegid() });
/// ```
pub fn map_root(&mut self) -> &mut Self {
self.container.map_root();
self
}
/// Maps one group ID to another.
///
/// Implies `Namespace::USER`.
///
/// # Implementation
///
/// This modifies `/proc/{pid}/gid_map` where `{pid}` is the PID of the child
/// process. See [`user_namespaces(7)`] for more details.
///
/// [`user_namespaces(7)`]: https://man7.org/linux/man-pages/man7/user_namespaces.7.html
pub fn map_gid(&mut self, inside_gid: libc::gid_t, outside_gid: libc::gid_t) -> &mut Self {
self.container.map_gid(inside_gid, outside_gid);
self
}
/// Maps potentially many group IDs inside the new user namespace to group
/// IDs outside of the user namespace.
///
/// Implies `Namespace::USER`.
///
/// # Implementation
///
/// This modifies `/proc/{pid}/gid_map` where `{pid}` is the PID of the child
/// process. See [`user_namespaces(7)`] for more details.
///
/// [`user_namespaces(7)`]: https://man7.org/linux/man-pages/man7/user_namespaces.7.html
pub fn map_gid_range(
&mut self,
starting_inside_gid: libc::gid_t,
starting_outside_gid: libc::gid_t,
count: u32,
) -> &mut Self {
self.container
.map_gid_range(starting_inside_gid, starting_outside_gid, count);
self
}
/// Sets the hostname of the container.
///
/// Implies `Namespace::UTS`, which requires `CAP_SYS_ADMIN`.
///
/// ```no_run
/// use reverie_process::Command;
///
/// let command = Command::new("cat")
/// .arg("/proc/sys/kernel/hostname")
/// .map_root()
/// .hostname("foobar.local");
/// ```
pub fn hostname<S: Into<OsString>>(&mut self, hostname: S) -> &mut Self {
self.container.hostname(hostname);
self
}
/// Sets the domain name of the container.
///
/// Implies `Namespace::UTS`, which requires `CAP_SYS_ADMIN`.
///
/// # Example
///
/// ```no_run
/// use reverie_process::Command;
///
/// let command = Command::new("cat")
/// .arg("/proc/sys/kernel/domainname")
/// .map_root()
/// .domainname("foobar");
/// ```
pub fn domainname<S: Into<OsString>>(&mut self, domainname: S) -> &mut Self {
self.container.domainname(domainname);
self
}
/// Gets the hostname of the container.
pub fn get_hostname(&self) -> Option<&OsStr> {
self.container.get_hostname()
}
/// Gets the domainname of the container.
pub fn get_domainname(&self) -> Option<&OsStr> {
self.container.get_domainname()
}
/// Adds a file system to be mounted. Note that these are mounted in the same
/// order as given.
///
/// Implies `Namespace::MOUNT`. Note that `Namespace::USER` should also have
/// been set and `map_uid` should have been called in order to gain the
/// privileges required to mount.
pub fn mount(&mut self, mount: Mount) -> &mut Self {
self.container.mount(mount);
self
}
/// Adds multiple mounts.
pub fn mounts<I>(&mut self, mounts: I) -> &mut Self
where
I: IntoIterator<Item = Mount>,
{
self.container.mounts(mounts);
self
}
/// Sets up the container to have local networking only. This will prevent
/// any network communication to the outside world.
///
/// Implies `Namespace::NETWORK` and `Namespace::MOUNT`.
///
/// This also causes a fresh `/sys` to be mounted to avoid seeing the host
/// network interfaces in `/sys/class/net`.
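///
/// # Example
///
/// A minimal sketch, assuming the `ip` tool is installed. `map_root` is used
/// here to gain the privileges needed to create the new namespaces; the
/// child would then be expected to see no host network interfaces.
///
/// ```no_run
/// use reverie_process::Command;
///
/// let mut command = Command::new("ip");
/// command
///     .arg("link")
///     .map_root()
///     .local_networking_only();
/// ```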
pub fn local_networking_only(&mut self) -> &mut Self {
self.container.local_networking_only();
self
}
/// Sets the seccomp filter. The filter is loaded immediately before `execve`
/// and *after* all `pre_exec` callbacks have been executed. Thus, you will
/// still be able to call filtered syscalls from `pre_exec` callbacks.
pub fn seccomp(&mut self, filter: seccomp::Filter) -> &mut Self {
self.container.seccomp(filter);
self
}
/// Sets the controlling pseudoterminal for the child process.
///
/// In the child process, this has the effect of:
/// 1. Creating a new session (with `setsid()`).
/// 2. Using an `ioctl` to set the controlling terminal.
/// 3. Setting this file descriptor as the stdio streams.
///
/// NOTE: Since this modifies the stdio streams, calling this will reset
/// [`Self::stdin`], [`Self::stdout`], and [`Self::stderr`] back to
/// [`Stdio::inherit()`].
pub fn pty(&mut self, child: PtyChild) -> &mut Self {
self.container.pty(child);
self
}
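/// Finds the path to the program.
///
/// An absolute path is returned as-is (without following symlinks) if it
/// refers to an executable file. A bare program name is searched for in the
/// child's `$PATH`. Any other relative path is resolved against the child's
/// working directory, or the parent's if none was set.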
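///
/// # Examples
///
/// A minimal sketch, assuming `cat` is installed somewhere in `$PATH`:
///
/// ```no_run
/// use reverie_process::Command;
///
/// # fn main() -> std::io::Result<()> {
/// // A bare program name is looked up in `$PATH`.
/// let path = Command::new("cat").find_program()?;
/// assert!(path.is_absolute());
/// # Ok(())
/// # }
/// ```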
pub fn find_program(&self) -> io::Result<PathBuf> {
let program = Path::new(self.get_program());
if program.is_absolute() {
// Note: We shouldn't canonicalize here since that will follow
// symlinks. Instead, just make sure the file exists and is
// executable.
let metadata = program.metadata()?;
if metadata.is_file() && metadata.permissions().mode() & 0o111 != 0 {
Ok(program.to_path_buf())
} else {
Err(Errno::EPERM.into())
}
} else if program.components().count() == 1 {
let path = self.get_env("PATH").unwrap_or_default();
let paths = path
.as_bytes()
.split(|c| *c == b':')
.map(|bytes| Path::new(OsStr::from_bytes(bytes)));
find_program_in_paths(program, paths)
.ok_or_else(|| {
io::Error::new(
io::ErrorKind::Other,
format!("Could not find {:?} in $PATH", program),
)
})?
.canonicalize()
} else {
// Assume it's in the current directory
let mut path = match self.get_current_dir() {
Some(path) => path.to_owned(),
None => std::env::current_dir()?,
};
path.push(program);
path.canonicalize()
}
}
}
fn find_program_in_paths<I, S>(program: &Path, iter: I) -> Option<PathBuf>
where
I: IntoIterator<Item = S>,
S: AsRef<Path>,
{
for path in iter.into_iter() {
let path = path.as_ref().join(program);
if let Ok(metadata) = path.metadata() {
// Only accept regular files with at least one executable bit set.
if metadata.is_file() && metadata.permissions().mode() & 0o111 != 0 {
return Some(path);
}
}
}
None
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn find_program() {
assert!(Command::new("cat").find_program().unwrap().is_absolute());
}
#[test]
fn get_program() {
assert_eq!(Command::new("cat").get_program(), "cat");
}
#[test]
fn get_arg0() {
assert_eq!(Command::new("cat").get_arg0(), "cat");
assert_eq!(Command::new("cat").arg0("dog").get_arg0(), "dog");
assert_eq!(
Command::new("cat").arg0("dog").program("catdog").get_arg0(),
"catdog"
);
}
#[test]
fn get_args() {
assert_eq!(
Command::new("cat")
.arg("a")
.arg("b")
.arg("c")
.get_args()
.collect::<Vec<_>>(),
vec![OsStr::new("a"), OsStr::new("b"), OsStr::new("c")]
);
}
}

@ -0,0 +1,259 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use super::ExitStatus;
use super::Pid;
use super::stdio::{ChildStderr, ChildStdin, ChildStdout, Stdio};
use super::Command;
use core::fmt;
use core::future::Future;
use core::pin::Pin;
use core::task::{Context, Poll};
use nix::sys::signal::Signal;
use serde::{Deserialize, Serialize};
use std::io;
use syscalls::Errno;
/// Represents a child process.
///
/// NOTE: The child process is not killed or waited on when `Child` is dropped.
/// If `Child` is not waited on before being dropped, the child will continue
/// to run in the background and, once it exits, will remain a "zombie" until
/// the parent exits. It is therefore best practice to always wait on child
/// processes.
#[derive(Debug)]
pub struct Child {
/// The child's process ID.
pub(super) pid: Pid,
/// The child's exit status. `Some` if the child has exited already, `None`
/// otherwise.
pub(super) exit_status: Option<ExitStatus>,
/// The handle for writing to the child's standard input (stdin), if it has
/// been captured.
pub stdin: Option<ChildStdin>,
/// The handle for reading from the child's standard output (stdout), if it
/// has been captured.
pub stdout: Option<ChildStdout>,
/// The handle for reading from the child's standard error (stderr), if it
/// has been captured.
pub stderr: Option<ChildStderr>,
}
/// The output of a finished process.
#[derive(PartialEq, Eq, Clone, Serialize, Deserialize)]
pub struct Output {
/// The exit status of the process.
pub status: ExitStatus,
/// The bytes that the process wrote to stdout.
pub stdout: Vec<u8>,
/// The bytes that the process wrote to stderr.
pub stderr: Vec<u8>,
}
impl fmt::Debug for Output {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
let stdout = core::str::from_utf8(&self.stdout);
let stdout: &dyn fmt::Debug = match stdout {
Ok(ref s) => s,
Err(_) => &self.stdout,
};
let stderr = core::str::from_utf8(&self.stderr);
let stderr: &dyn fmt::Debug = match stderr {
Ok(ref s) => s,
Err(_) => &self.stderr,
};
f.debug_struct("Output")
.field("status", &self.status)
.field("stdout", stdout)
.field("stderr", stderr)
.finish()
}
}
impl Child {
/// Returns the PID of the child.
pub fn id(&self) -> Pid {
self.pid
}
/// Attempts to collect the exit status of the child if it has already
/// exited.
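///
/// Returns `Ok(None)` if the child is still running.
///
/// # Examples
///
/// A minimal sketch, assuming `sleep` is available on the system:
///
/// ```no_run
/// use reverie_process::Command;
///
/// # async fn example() -> std::io::Result<()> {
/// let mut child = Command::new("sleep").arg("1").spawn()?;
/// // This call does not block; it only checks whether the child has exited.
/// match child.try_wait()? {
///     Some(status) => println!("exited: {:?}", status),
///     None => println!("still running"),
/// }
/// # Ok(())
/// # }
/// ```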
pub fn try_wait(&mut self) -> io::Result<Option<ExitStatus>> {
match self.exit_status {
Some(exit_status) => Ok(Some(exit_status)),
None => {
let mut status = 0;
let ret = Errno::result(unsafe {
libc::waitpid(self.pid.as_raw(), &mut status, libc::WNOHANG)
})?;
if ret == 0 {
Ok(None)
} else {
let exit_status = ExitStatus::from_raw(status);
self.exit_status = Some(exit_status);
Ok(Some(exit_status))
}
}
}
}
/// Waits for the child to exit completely, returning its exit status. This
/// function will continue to return the same exit status after the child
/// process has fully exited.
///
/// To avoid deadlocks, the child's stdin handle, if any, will be closed
/// before waiting. Otherwise, the child could block waiting for input from
/// the parent while the parent is waiting for the child. To keep the stdin
/// handle open and control it explicitly, the caller can `.take()` it before
/// calling `.wait()`.
pub async fn wait(&mut self) -> io::Result<ExitStatus> {
// Ensure stdin is closed.
drop(self.stdin.take());
WaitForChild::new(self)?.await
}
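/// Blocks until the child process exits, returning its exit status. The
/// status is cached, so later calls (and `try_wait`) return the same value
/// instead of waiting on an already-reaped PID.
pub fn wait_blocking(&mut self) -> io::Result<ExitStatus> {
drop(self.stdin.take());
if let Some(exit_status) = self.exit_status {
return Ok(exit_status);
}
let mut status = 0;
let ret = loop {
match Errno::result(unsafe { libc::waitpid(self.pid.as_raw(), &mut status, 0) }) {
Ok(ret) => break ret,
Err(Errno::EINTR) => continue,
Err(err) => return Err(err.into()),
}
};
debug_assert_ne!(ret, 0);
let exit_status = ExitStatus::from_raw(status);
self.exit_status = Some(exit_status);
Ok(exit_status)
}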
/// Simultaneously waits for the child to exit and collects all remaining
/// output on the stdout/stderr handles, returning an `Output` instance.
///
/// To avoid deadlocks, the child's stdin handle, if any, will be closed
/// before waiting. Otherwise, the child could block waiting for input from
/// the parent while the parent is waiting for the child.
///
/// By default, stdin, stdout and stderr are inherited from the parent. In
/// order to capture the output into this `Result<Output>` it is necessary to
/// create new pipes between parent and child. Use `stdout(Stdio::piped())`
/// or `stderr(Stdio::piped())`, respectively.
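///
/// # Examples
///
/// A minimal sketch, assuming it runs inside a Tokio runtime and that `echo`
/// is available on the system:
///
/// ```no_run
/// use reverie_process::{Command, Stdio};
///
/// # async fn example() -> std::io::Result<()> {
/// let child = Command::new("echo")
///     .arg("hello")
///     .stdout(Stdio::piped())
///     .spawn()?;
/// // Only stdout was piped, so `output.stderr` stays empty.
/// let output = child.wait_with_output().await?;
/// println!("{}", String::from_utf8_lossy(&output.stdout));
/// # Ok(())
/// # }
/// ```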
pub async fn wait_with_output(mut self) -> io::Result<Output> {
use futures::future::try_join3;
use tokio::io::{AsyncRead, AsyncReadExt};
async fn read_to_end<A: AsyncRead + Unpin>(io: Option<A>) -> io::Result<Vec<u8>> {
let mut vec = Vec::new();
if let Some(mut io) = io {
io.read_to_end(&mut vec).await?;
}
Ok(vec)
}
let stdout_fut = read_to_end(self.stdout.take());
let stderr_fut = read_to_end(self.stderr.take());
let (status, stdout, stderr) = try_join3(self.wait(), stdout_fut, stderr_fut).await?;
Ok(Output {
status,
stdout,
stderr,
})
}
/// Sends a signal to the child. If the child has already been waited on,
/// this does nothing and returns success.
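///
/// # Examples
///
/// A minimal sketch, assuming it runs inside a Tokio runtime and that
/// `sleep` is available on the system:
///
/// ```no_run
/// use nix::sys::signal::Signal;
/// use reverie_process::Command;
///
/// # async fn example() -> std::io::Result<()> {
/// let mut child = Command::new("sleep").arg("60").spawn()?;
/// // Ask the child to terminate, then reap it so it does not linger.
/// child.signal(Signal::SIGTERM)?;
/// let status = child.wait().await?;
/// println!("child exited with {:?}", status);
/// # Ok(())
/// # }
/// ```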
pub fn signal(&self, sig: Signal) -> io::Result<()> {
if self.exit_status.is_none() {
Errno::result(unsafe { libc::kill(self.pid.as_raw(), sig as i32) })?;
}
Ok(())
}
}
impl Command {
/// Executes the command, waiting for it to finish and collecting its exit
/// status.
pub async fn status(&mut self) -> io::Result<ExitStatus> {
let mut child = self.spawn()?;
// Ensure we close any stdio handles so we can't deadlock waiting on the
// child which may be waiting to read/write to a pipe we're holding.
drop(child.stdin.take());
drop(child.stdout.take());
drop(child.stderr.take());
child.wait().await
}
/// Executes the command, waiting for it to finish while collecting its
/// stdout and stderr into buffers.
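///
/// Note that this always sets stdout and stderr to [`Stdio::piped`],
/// replacing any earlier configuration of those two streams.
///
/// # Examples
///
/// A minimal sketch, assuming it runs inside a Tokio runtime and that `echo`
/// is available on the system:
///
/// ```no_run
/// use reverie_process::Command;
///
/// # async fn example() -> std::io::Result<()> {
/// let output = Command::new("echo").arg("hello").output().await?;
/// // `stdout` and `stderr` hold the raw bytes written by the child.
/// println!("{}", String::from_utf8_lossy(&output.stdout));
/// # Ok(())
/// # }
/// ```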
pub async fn output(&mut self) -> io::Result<Output> {
self.stdout(Stdio::piped());
self.stderr(Stdio::piped());
let child = self.spawn();
child?.wait_with_output().await
}
}
struct WaitForChild<'a> {
/// Signal future. Used to get notified asynchronously of a child exiting.
signal: tokio::signal::unix::Signal,
child: &'a mut Child,
}
impl<'a> WaitForChild<'a> {
fn new(child: &'a mut Child) -> io::Result<Self> {
use tokio::signal::unix::{signal, SignalKind};
Ok(Self {
signal: signal(SignalKind::child())?,
child,
})
}
}
impl<'a> Future for WaitForChild<'a> {
type Output = io::Result<ExitStatus>;
fn poll(mut self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output> {
loop {
// Register an interest in SIGCHLD signals. We can't just call
// `try_wait` right away. We might miss a signal event if the child
// hasn't exited yet. Thus, we poll the signal stream to tell Tokio
// we're interested in signal events.
let sig = self.signal.poll_recv(cx);
if let Some(status) = self.child.try_wait()? {
return Poll::Ready(Ok(status));
}
if sig.is_pending() {
return Poll::Pending;
}
}
}
}

@ -0,0 +1,47 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use syscalls::Errno;
use super::Pid;
/// Creates a child process with `clone(2)` using the given `flags` and runs
/// `cb` in it on a fixed 4096-byte stack. The value returned by `cb` becomes
/// the child's exit status. Returns the PID of the child.
pub fn clone<F>(cb: F, flags: libc::c_int) -> Result<Pid, Errno>
where
F: FnMut() -> i32,
{
let mut stack = [0u8; 4096];
clone_with_stack(cb, flags, &mut stack)
}
/// Like `clone`, but runs `cb` on the caller-provided `stack`. Since the
/// stack grows downward, the pointer handed to `clone(2)` is the end of
/// `stack`, rounded down to a 16-byte boundary.
pub fn clone_with_stack<F>(cb: F, flags: libc::c_int, stack: &mut [u8]) -> Result<Pid, Errno>
where
F: FnMut() -> i32,
{
type CloneCb<'a> = Box<dyn FnMut() -> i32 + 'a>;
extern "C" fn callback(data: *mut CloneCb) -> libc::c_int {
let cb: &mut CloneCb = unsafe { &mut *data };
(*cb)() as libc::c_int
}
let mut cb: CloneCb = Box::new(cb);
let res = unsafe {
let stack = stack.as_mut_ptr().add(stack.len());
let stack = stack.sub(stack as usize % 16);
libc::clone(
core::mem::transmute(callback as extern "C" fn(*mut Box<dyn FnMut() -> i32>) -> i32),
stack as *mut libc::c_void,
flags,
&mut cb as *mut _ as *mut libc::c_void,
)
};
Errno::result(res).map(Pid::from_raw)
}

@ -0,0 +1,978 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use super::clone::clone_with_stack;
use super::env::Env;
use super::error::{AddContext, Context, Error};
use super::exit_status::ExitStatus;
use super::fd::{pipe, write_bytes, Fd};
use super::id_map::make_id_map;
use super::mount::Mount;
use super::namespace::Namespace;
use super::net::IfName;
use super::pid::Pid;
use super::pty::PtyChild;
use super::seccomp;
use super::stdio::Stdio;
use super::util::reset_signal_handling;
use super::util::to_cstring;
use nix::sched::{sched_setaffinity, CpuSet};
use serde::de::DeserializeOwned;
use serde::Serialize;
use syscalls::Errno;
use std::borrow::Cow;
use std::collections::BTreeMap;
use std::ffi::CString;
use std::ffi::OsStr;
use std::ffi::OsString;
use std::io::Read;
use std::os::unix::ffi::OsStrExt;
use std::os::unix::io::AsRawFd;
use std::path::Path;
/// A `Container` is a configuration of how a process shall be spawned. It can,
/// but doesn't have to, include Linux namespace configuration.
///
/// NOTE: Configuring resource limits via cgroups is not yet supported.
pub struct Container {
pub(super) env: Env,
current_dir: Option<CString>,
chroot: Option<CString>,
pub(super) namespace: Namespace,
pub(super) stdin: Stdio,
pub(super) stdout: Stdio,
pub(super) stderr: Stdio,
pub(super) uid_map: Vec<(libc::uid_t, libc::uid_t, u32)>,
pub(super) gid_map: Vec<(libc::gid_t, libc::gid_t, u32)>,
mounts: Vec<Mount>,
local_networking_only: bool,
hostname: Option<OsString>,
domainname: Option<OsString>,
seccomp: Option<seccomp::Filter>,
pub(super) pty: Option<PtyChild>,
/// The core number to which the new process, and descendants, will be
/// pinned.
affinity: Option<usize>,
}
impl Default for Container {
fn default() -> Self {
Self {
env: Default::default(),
current_dir: None,
chroot: None,
namespace: Default::default(),
stdin: Stdio::inherit(),
stdout: Stdio::inherit(),
stderr: Stdio::inherit(),
uid_map: Vec::new(),
gid_map: Vec::new(),
mounts: Vec::new(),
local_networking_only: false,
hostname: None,
domainname: None,
seccomp: None,
pty: None,
affinity: None,
}
}
}
impl Container {
/// Creates a new `Container` that inherits everything from the parent
/// process.
pub fn new() -> Self {
Self::default()
}
/// Inserts or updates an environment variable mapping.
///
/// Note that environment variable names are case-insensitive (but
/// case-preserving) on Windows, and case-sensitive on all other platforms.
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::Container;
///
/// let container = Container::new()
/// .env("PATH", "/bin");
/// ```
pub fn env<K, V>(&mut self, key: K, val: V) -> &mut Self
where
K: AsRef<OsStr>,
V: AsRef<OsStr>,
{
self.env.set(key.as_ref(), val.as_ref());
self
}
/// Adds or updates multiple environment variable mappings.
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::{Container, Stdio};
/// use std::env;
/// use std::collections::HashMap;
///
/// let filtered_env : HashMap<String, String> =
/// env::vars().filter(|&(ref k, _)|
/// k == "TERM" || k == "TZ" || k == "LANG" || k == "PATH"
/// ).collect();
///
/// let container = Container::new()
/// .stdin(Stdio::null())
/// .stdout(Stdio::inherit())
/// .env_clear()
/// .envs(&filtered_env);
/// ```
pub fn envs<I, K, V>(&mut self, vars: I) -> &mut Self
where
I: IntoIterator<Item = (K, V)>,
K: AsRef<OsStr>,
V: AsRef<OsStr>,
{
for (k, v) in vars.into_iter() {
self.env(k, v);
}
self
}
/// Removes an environment variable mapping.
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::Container;
///
/// let container = Container::new()
/// .env_remove("PATH");
/// ```
pub fn env_remove<K: AsRef<OsStr>>(&mut self, key: K) -> &mut Self {
self.env.remove(key.as_ref());
self
}
/// Clears the entire environment map for the child process.
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::Container;
///
/// let container = Container::new()
/// .env_clear();
/// ```
pub fn env_clear(&mut self) -> &mut Self {
self.env.clear();
self
}
/// Sets the working directory for the child process.
///
/// # Interaction with `chroot`
///
/// The working directory is set *after* the chroot is performed (if a chroot
/// directory is specified). Thus, the path given is relative to the chroot
/// directory. Otherwise, if no chroot directory is specified, the working
/// directory is relative to the current working directory of the parent
/// process at the time the child process is spawned.
///
/// # Platform-specific behavior
///
/// If the program path is relative (e.g., `"./script.sh"`), it's ambiguous
/// whether it should be interpreted relative to the parent's working
/// directory or relative to `current_dir`. The behavior in this case is
/// platform specific and unstable, and it's recommended to use
/// [`canonicalize`] to get an absolute program path instead.
///
/// [`canonicalize`]: std::fs::canonicalize()
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::Container;
///
/// let container = Container::new()
/// .current_dir("/bin");
/// ```
pub fn current_dir<P: AsRef<Path>>(&mut self, dir: P) -> &mut Self {
self.current_dir = Some(to_cstring(dir.as_ref()));
self
}
/// Sets configuration for the child process's standard input (stdin) handle.
///
/// Defaults to [`Stdio::inherit`] when used with `spawn` or `status`, and
/// defaults to [`Stdio::piped`] when used with `output`.
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::{Container, Stdio};
///
/// let container = Container::new()
/// .stdin(Stdio::null());
/// ```
pub fn stdin<T: Into<Stdio>>(&mut self, cfg: T) -> &mut Self {
self.stdin = cfg.into();
self
}
/// Sets configuration for the child process's standard output (stdout)
/// handle.
///
/// Defaults to [`Stdio::inherit`] when used with `spawn` or `status`, and
/// defaults to [`Stdio::piped`] when used with `output`.
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::{Container, Stdio};
///
/// let container = Container::new()
/// .stdout(Stdio::null());
/// ```
pub fn stdout<T: Into<Stdio>>(&mut self, cfg: T) -> &mut Self {
self.stdout = cfg.into();
self
}
/// Sets configuration for the child process's standard error (stderr)
/// handle.
///
/// Defaults to [`Stdio::inherit`] when used with `spawn` or `status`, and
/// defaults to [`Stdio::piped`] when used with `output`.
///
/// # Examples
///
/// Basic usage:
///
/// ```no_run
/// use reverie_process::{Container, Stdio};
///
/// let container = Container::new()
/// .stderr(Stdio::null());
/// ```
pub fn stderr<T: Into<Stdio>>(&mut self, cfg: T) -> &mut Self {
self.stderr = cfg.into();
self
}
/// Changes the root directory of the child process to the specified path.
/// This root directory is inherited by all descendants of the child
/// process.
///
/// Note that changing the root directory may cause the program to not be
/// found. As such, the program path should be relative to this directory.
pub fn chroot<P: AsRef<Path>>(&mut self, chroot: P) -> &mut Self {
self.chroot = Some(to_cstring(chroot.as_ref()));
self
}
/// Unshares parts of the process execution context that are normally shared
/// with the parent process. This is useful for executing the child process
/// in a new namespace.
pub fn unshare(&mut self, namespace: Namespace) -> &mut Self {
self.namespace |= namespace;
self
}
/// Returns the working directory for the child process.
///
/// This returns `None` if the working directory will not be changed.
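///
/// # Examples
///
/// A minimal sketch of the expected round trip, assuming the path is stored
/// unchanged:
///
/// ```no_run
/// use reverie_process::Container;
/// use std::path::Path;
///
/// let mut container = Container::new();
/// // Nothing is configured yet, so the child keeps the parent's directory.
/// assert!(container.get_current_dir().is_none());
///
/// container.current_dir("/bin");
/// assert_eq!(container.get_current_dir(), Some(Path::new("/bin")));
/// ```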
pub fn get_current_dir(&self) -> Option<&Path> {
if let Some(dir) = &self.current_dir {
Some(Path::new(OsStr::from_bytes(dir.to_bytes())))
} else {
None
}
}
/// Returns an iterator of the environment variables that will be set when
/// the process is spawned. Note that this does not include any environment
/// variables inherited from the parent process.
pub fn get_envs(&self) -> impl Iterator<Item = (&OsStr, Option<&OsStr>)> {
self.env.iter()
}
/// Returns a mapping of all environment variables that the new child process
/// will inherit.
pub fn get_captured_envs(&self) -> BTreeMap<OsString, OsString> {
self.env.capture()
}
/// Gets an environment variable. If the child process is to inherit this
/// environment variable from the current process, then this returns the
/// current process's environment variable unless it is to be overridden.
pub fn get_env<K: AsRef<OsStr>>(&self, env: K) -> Option<Cow<OsStr>> {
self.env.get_captured(env)
}
/// Maps one user ID to another.
///
/// Implies `Namespace::USER`.
///
/// # Example
///
/// This can be used to gain `CAP_SYS_ADMIN` privileges in the user
/// namespace by mapping the root user inside the container to the current
/// user outside of the container.
///
/// ```no_run
/// use reverie_process::Container;
///
/// let container = Container::new()
/// .map_uid(0, unsafe { libc::getuid() });
/// ```
///
/// # Implementation
///
/// This modifies `/proc/{pid}/uid_map` where `{pid}` is the PID of the child
/// process. See [`user_namespaces(7)`] for more details.
///
/// [`user_namespaces(7)`]: https://man7.org/linux/man-pages/man7/user_namespaces.7.html
pub fn map_uid(&mut self, inside_uid: libc::uid_t, outside_uid: libc::uid_t) -> &mut Self {
self.map_uid_range(inside_uid, outside_uid, 1)
}
/// Maps potentially many user IDs inside the new user namespace to user IDs
/// outside of the user namespace.
///
/// Implies `Namespace::USER`.
///
/// # Implementation
///
/// This modifies `/proc/{pid}/uid_map` where `{pid}` is the PID of the child
/// process. See [`user_namespaces(7)`] for more details.
///
/// [`user_namespaces(7)`]: https://man7.org/linux/man-pages/man7/user_namespaces.7.html
pub fn map_uid_range(
&mut self,
starting_inside_uid: libc::uid_t,
starting_outside_uid: libc::uid_t,
count: u32,
) -> &mut Self {
self.uid_map
.push((starting_inside_uid, starting_outside_uid, count));
self.namespace |= Namespace::USER;
self
}
/// Convenience function for mapping root (inside the container) to the current
/// user ID (outside the container). This is useful for gaining new
/// capabilities inside the container, such as being able to mount file
/// systems.
///
/// Implies `Namespace::USER`.
///
/// This is the same as:
/// ```no_run
/// use reverie_process::Container;
///
/// let container = Container::new()
/// .map_uid(0, unsafe { libc::geteuid() })
/// .map_gid(0, unsafe { libc::getegid() });
/// ```
pub fn map_root(&mut self) -> &mut Self {
self.map_uid(0, unsafe { libc::geteuid() });
self.map_gid(0, unsafe { libc::getegid() })
}
/// Maps one group ID to another.
///
/// Implies `Namespace::USER`.
///
/// # Implementation
///
/// This modifies `/proc/{pid}/gid_map` where `{pid}` is the PID of the child
/// process. See [`user_namespaces(7)`] for more details.
///
/// [`user_namespaces(7)`]: https://man7.org/linux/man-pages/man7/user_namespaces.7.html
pub fn map_gid(&mut self, inside_gid: libc::gid_t, outside_gid: libc::gid_t) -> &mut Self {
self.map_gid_range(inside_gid, outside_gid, 1)
}
/// Maps potentially many group IDs inside the new user namespace to group
/// IDs outside of the user namespace.
///
/// Implies `Namespace::USER`.
///
/// # Implementation
///
/// This modifies `/proc/{pid}/gid_map` where `{pid}` is the PID of the child
/// process. See [`user_namespaces(7)`] for more details.
///
/// [`user_namespaces(7)`]: https://man7.org/linux/man-pages/man7/user_namespaces.7.html
pub fn map_gid_range(
&mut self,
starting_inside_gid: libc::gid_t,
starting_outside_gid: libc::gid_t,
count: u32,
) -> &mut Self {
self.namespace |= Namespace::USER;
self.gid_map
.push((starting_inside_gid, starting_outside_gid, count));
self
}
/// Sets the hostname of the container.
///
/// Implies `Namespace::UTS`, which requires `CAP_SYS_ADMIN`.
///
/// ```no_run
/// use reverie_process::Container;
///
/// let container = Container::new()
/// .map_root()
/// .hostname("foobar.local");
/// ```
pub fn hostname<S: Into<OsString>>(&mut self, hostname: S) -> &mut Self {
self.namespace |= Namespace::UTS;
self.hostname = Some(hostname.into());
self
}
/// Sets the domain name of the container.
///
/// Implies `Namespace::UTS`, which requires `CAP_SYS_ADMIN`.
///
/// # Example
///
/// ```no_run
/// use reverie_process::Container;
///
/// let container = Container::new()
/// .map_root()
/// .domainname("foobar");
/// ```
pub fn domainname<S: Into<OsString>>(&mut self, domainname: S) -> &mut Self {
self.namespace |= Namespace::UTS;
self.domainname = Some(domainname.into());
self
}
/// Gets the hostname of the container.
pub fn get_hostname(&self) -> Option<&OsStr> {
self.hostname.as_ref().map(AsRef::as_ref)
}
/// Gets the domainname of the container.
pub fn get_domainname(&self) -> Option<&OsStr> {
self.domainname.as_ref().map(AsRef::as_ref)
}
/// Adds a file system to be mounted. Note that these are mounted in the same
/// order as given.
///
/// Implies `Namespace::MOUNT`. Note that `Namespace::USER` should also have
/// been set and `map_uid` should have been called in order to gain the
/// privileges required to mount.
pub fn mount(&mut self, mount: Mount) -> &mut Self {
self.namespace |= Namespace::MOUNT;
self.mounts.push(mount);
self
}
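/// Adds multiple mounts. Like [`Self::mount`], these are mounted in the same
/// order as given.
///
/// Implies `Namespace::MOUNT`.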
pub fn mounts<I>(&mut self, mounts: I) -> &mut Self
where
I: IntoIterator<Item = Mount>,
{
self.namespace |= Namespace::MOUNT;
self.mounts.extend(mounts);
self
}
/// Sets up the container to have local networking only. This will prevent
/// any network communication to the outside world.
///
/// Implies `Namespace::NETWORK` and `Namespace::MOUNT`.
///
/// This also causes a fresh `/sys` to be mounted to avoid seeing the host
/// network interfaces in `/sys/class/net`.
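///
/// # Example
///
/// A minimal sketch, added for illustration and mirroring this module's
/// `bind_to_low_port` test. It assumes the current user is allowed to create
/// user namespaces; mapping root is what grants the privileges needed to bind
/// to a low port inside the container.
///
/// ```no_run
/// use reverie_process::Container;
/// use std::net::TcpListener;
///
/// let addr = Container::new()
///     .map_root()
///     .local_networking_only()
///     .run(|| {
///         // Only the loopback interface is available in here.
///         let listener = TcpListener::bind("127.0.0.1:80").unwrap();
///         listener.local_addr().unwrap()
///     })
///     .unwrap();
/// assert_eq!(addr.port(), 80);
/// ```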
pub fn local_networking_only(&mut self) -> &mut Self {
if !self.local_networking_only {
self.local_networking_only = true;
self.namespace |= Namespace::NETWORK;
self.mount(Mount::sysfs("/sys"));
}
self
}
/// Sets the seccomp filter. The filter is loaded immediately before `execve`
/// and *after* all `pre_exec` callbacks have been executed. Thus, you will
/// still be able to call filtered syscalls from `pre_exec` callbacks.
pub fn seccomp(&mut self, filter: seccomp::Filter) -> &mut Self {
self.seccomp = Some(filter);
self
}
/// Sets the controlling pseudoterminal for the child process.
///
/// In the child process, this has the effect of:
/// 1. Creating a new session (with `setsid()`).
/// 2. Using an `ioctl` to set the controlling terminal.
/// 3. Setting this file descriptor as the stdio streams.
///
/// NOTE: Since this modifies the stdio streams, calling this will reset
/// [`Self::stdin`], [`Self::stdout`], and [`Self::stderr`] back to
/// [`Stdio::inherit()`].
pub fn pty(&mut self, child: PtyChild) -> &mut Self {
self.pty = Some(child);
self.stdin = Stdio::inherit();
self.stdout = Stdio::inherit();
self.stderr = Stdio::inherit();
self
}
/// Sets the CPU to which the child threads/processes will be pinned.
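///
/// # Example
///
/// A minimal sketch, added for illustration. It assumes CPU 0 is among the
/// CPUs that the current process is allowed to run on.
///
/// ```no_run
/// use reverie_process::Container;
///
/// let mut container = Container::new();
/// // Pin the child (and anything it spawns) to CPU 0.
/// container.affinity(0);
/// ```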
pub fn affinity(&mut self, affinity: usize) -> &mut Self {
self.affinity = Some(affinity);
self
}
/// Called by the child process after `clone` to get itself set up for either
/// `execve` or running an arbitrary function.
///
/// NOTE: Although this function takes `&mut self`, it is only called in the
/// context of the child process (which has a copy-on-write view of the
/// parent's virtual memory). Thus, the parent's version isn't actually
/// modified.
pub(super) fn setup(
&mut self,
context: &ChildContext,
pre_exec: &mut [Box<dyn FnMut() -> Result<(), Errno> + Send + Sync>],
) -> Result<(), Error> {
// NOTE: This function MUST NOT allocate or deallocate any memory! Doing
// so can cause random, difficult to diagnose deadlocks.
if let Some(pty) = self.pty.take() {
// NOTE: This is done *before* setting the stdio streams so that the
// user can still override individual streams if they only want them
// to be partially attached to the tty.
pty.login().context(Context::Tty)?;
}
if let Some(fd) = context.stdin {
fd.dup2(libc::STDIN_FILENO)
.context(Context::Stdio)?
.leave_open();
}
if let Some(fd) = context.stdout {
fd.dup2(libc::STDOUT_FILENO)
.context(Context::Stdio)?
.leave_open();
}
if let Some(fd) = context.stderr {
fd.dup2(libc::STDERR_FILENO)
.context(Context::Stdio)?
.leave_open();
}
unsafe { reset_signal_handling() }.context(Context::ResetSignals)?;
// Set up UID and GID maps.
if !context.uid_map.is_empty() {
context.map_uid().context(Context::MapUid)?;
}
if !context.gid_map.is_empty() {
context.setgroups(false).context(Context::MapGid)?;
context.map_gid().context(Context::MapGid)?;
}
// Set host name, if any.
if let Some(name) = &self.hostname {
Error::result(
unsafe { libc::sethostname(name.as_bytes().as_ptr() as *const _, name.len()) },
Context::Hostname,
)?;
}
// Set domain name, if any.
if let Some(name) = &self.domainname {
Error::result(
unsafe { libc::setdomainname(name.as_bytes().as_ptr() as *const _, name.len()) },
Context::Domainname,
)?;
}
// Mount all the things.
for mount in &mut self.mounts {
mount.mount().context(Context::Mount)?;
}
// Change root directory. Note that we do this *after* mounting anything
// so that bind mount sources that live outside of the chroot directory
// can work.
if let Some(chroot) = &self.chroot {
Error::result(unsafe { libc::chroot(chroot.as_ptr()) }, Context::Chroot)?;
}
// Set working directory, if any.
if let Some(current_dir) = &self.current_dir {
Error::result(unsafe { libc::chdir(current_dir.as_ptr()) }, Context::Chdir)?;
}
// Configure networking.
// TODO: Generalize this a bit to allow more complex configuration.
if self.local_networking_only {
// Need a socket to access the network interface.
let sock = Fd::socket(libc::AF_INET, libc::SOCK_DGRAM, libc::IPPROTO_IP)
.context(Context::Network)?;
let loopback = IfName::LOOPBACK;
// Bring up the loopback interface in the new network namespace.
let flags = loopback.get_flags(&sock).context(Context::Network)?;
let flags = flags | libc::IFF_UP as i16;
loopback.set_flags(&sock, flags).context(Context::Network)?;
}
if let Some(cpu) = self.affinity {
let mut cpu_set = CpuSet::new();
cpu_set.set(cpu).context(Context::Affinity)?;
sched_setaffinity(nix::unistd::Pid::from_raw(0), &cpu_set)
.context(Context::Affinity)?;
}
// NOTE: We must call our pre_exec callbacks BEFORE installing the
// seccomp filter because our callbacks could be calling syscalls that
// our seccomp filter may be intending to block.
for f in pre_exec {
f().context(Context::PreExec)?;
}
// Set up the seccomp filter, if any.
if let Some(filter) = &self.seccomp {
filter.load().context(Context::Seccomp)?;
}
Ok(())
}
/// Runs a function in a new process with the specified namespaces unshared. This
/// blocks until the function itself returns and the process has exited.
///
/// # Safety
///
/// - This should be called early on in the life of a process, before any
/// other threads are created. This reduces the chance that any global
/// resources (like the Tokio runtime) have been created yet.
///
/// - Memory allocated in the parent must not be freed in the child,
/// especially if using jemalloc where a separate thread does deallocations.
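///
/// # Example
///
/// A minimal sketch, added for illustration and mirroring this module's
/// `return_value` test. The closure runs in the child process; its return
/// value is serialized there and deserialized in the parent.
///
/// ```no_run
/// use reverie_process::Container;
///
/// let answer = Container::new().run(|| 42).unwrap();
/// assert_eq!(answer, 42);
/// ```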
pub fn run<F, T>(&mut self, mut f: F) -> Result<T, RunError>
where
F: FnMut() -> T,
T: Serialize + DeserializeOwned,
{
let clone_flags = self.namespace.bits() | libc::SIGCHLD;
let uid_map = &make_id_map(&self.uid_map);
let gid_map = &make_id_map(&self.gid_map);
let context = ChildContext {
// TODO: Honor stdio options. For now, always inherit from the
// parent process.
stdin: None,
stdout: None,
stderr: None,
uid_map,
gid_map,
};
// Use a pipe for getting the result of the function out of the child
// process.
let (mut reader, writer) = pipe()?;
let writer_fd = writer.as_raw_fd();
// NOTE: Must use a dynamically allocated stack here. Programs expect to
// have at least 2 MB of stack space and if we've already used up some
// stack space before this is called we could overflow the stack.
let mut stack = vec![0u8; 1024 * 1024 * 2];
// Disable io redirection just before forking. We want the child process to
// be able to call `println!()` and have that output go to stdout.
//
// See: https://github.com/rust-lang/rust/issues/35136
let output_capture = std::io::set_output_capture(None);
let result = clone_with_stack(
|| {
let value = self.setup(&context, &mut []).map(|()| f());
let writer = std::io::BufWriter::new(Fd::new(writer_fd));
// Serialize this result with bincode and send it to the parent
// process via a pipe.
//
// TODO: Handle serialization errors(?)
bincode::serialize_into(writer, &value).expect("Failed to serialize return value");
0
},
clone_flags,
&mut stack,
);
std::io::set_output_capture(output_capture);
let child = WaitGuard::new(result?);
// The writer end must be dropped first so that our reader doesn't block
// forever.
drop(writer);
// Read the return value. Note that we do this *before* waiting on the
// process to exit. Otherwise, for return values that exceed the pipe
// capacity, we would deadlock.
let mut buf = Vec::new();
match reader.read_to_end(&mut buf) {
Ok(0) => {
// The writer end was closed before anything could be written.
// This indicates that the process exited before the return
// value could be serialized. The only thing we can do in this
// case is collect the exit status of the process.
//
// NOTE: Since we always send `Result<T, _>` through the pipe,
// we can guarantee that a successful serialization will never
// be 0 bytes (since it always takes more than 0 bytes to encode
// that type).
//
// NOTE: Since `WaitGuard` is used, we guarantee that the
// process will be waited on in the other cases.
Err(RunError::ExitStatus(child.wait()?))
}
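Ok(n) => {
// The child sends a `Result<T, Error>` through the pipe. An `Err`
// means that `setup` failed before the function could run, so
// report it as a spawn error instead of panicking.
//
// FIXME: Handle deserialization errors
let value: Result<T, Error> = bincode::deserialize(&buf[0..n]).unwrap();
value.map_err(RunError::Spawn)
}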
Err(err) => {
// FIXME: Handle this error
panic!("Got unexpected error: {}", err)
}
}
}
}
pub(super) struct ChildContext<'a> {
pub stdin: Option<&'a Fd>,
pub stdout: Option<&'a Fd>,
pub stderr: Option<&'a Fd>,
pub uid_map: &'a [u8],
pub gid_map: &'a [u8],
}
impl<'a> ChildContext<'a> {
fn map_uid(&self) -> Result<(), Errno> {
write_bytes(b"/proc/self/uid_map\0", self.uid_map)
}
fn map_gid(&self) -> Result<(), Errno> {
write_bytes(b"/proc/self/gid_map\0", self.gid_map)
}
fn setgroups(&self, allow: bool) -> Result<(), Errno> {
write_bytes(
b"/proc/self/setgroups\0",
if allow { b"allow\0" } else { b"deny\0" },
)
}
}
/// An error that occurred while running a containerized function.
#[derive(thiserror::Error, Debug, Eq, PartialEq)]
pub enum RunError {
/// An error that occurred while spawning the container.
#[error("Process failed to spawn: {0}")]
Spawn(#[from] Error),
/// The function exited prematurely. This can happen if the function called
/// `std::process::exit(0)`, preventing the return value from being sent to
/// the parent. It can also happen if the process panics.
#[error("Process exited with code: {0:?}")]
ExitStatus(ExitStatus),
}
impl From<Errno> for RunError {
fn from(errno: Errno) -> Self {
Self::Spawn(Error::from(errno))
}
}
// Helper guard for making sure that the process gets waited on even if an error
// is encountered.
struct WaitGuard(Option<Pid>);
impl WaitGuard {
pub fn new(pid: Pid) -> Self {
Self(Some(pid))
}
/// Eagerly waits for the pid. Otherwise, it'll get waited on upon drop.
pub fn wait(mut self) -> Result<ExitStatus, Errno> {
let pid = self.0.take().unwrap();
let mut status = 0;
let ret = Errno::result(unsafe { libc::waitpid(pid.as_raw(), &mut status, 0) })?;
assert_ne!(ret, 0);
Ok(ExitStatus::from_raw(status))
}
}
impl Drop for WaitGuard {
fn drop(&mut self) {
if let Some(pid) = self.0.take() {
let mut status = 0;
unsafe {
libc::waitpid(pid.as_raw(), &mut status, 0);
}
}
}
}
#[cfg(test)]
mod tests {
use super::*;
use nix::sys::signal::Signal;
#[test]
fn can_panic() {
assert_eq!(
Container::new().run(|| panic!()),
Err(RunError::ExitStatus(ExitStatus::Signaled(
Signal::SIGABRT,
true
)))
);
}
#[test]
fn is_new_process() {
let my_pid = unsafe { libc::getpid() };
assert_eq!(
Container::new().run(|| {
assert_ne!(unsafe { libc::getpid() }, 1);
assert_ne!(unsafe { libc::getpid() }, my_pid);
assert_eq!(unsafe { libc::getppid() }, my_pid);
}),
Ok(())
);
}
#[test]
fn pid_namespace() {
assert_eq!(
Container::new()
.unshare(Namespace::USER | Namespace::PID)
.run(|| {
// New PID namespace, so this should be the init process.
assert_eq!(unsafe { libc::getpid() }, 1);
}),
Ok(())
);
}
#[test]
fn return_value() {
assert_eq!(Container::new().run(|| 42), Ok(42));
assert_eq!(
Container::new().run(|| String::from("foobar")),
Ok("foobar".into())
);
}
#[test]
fn huge_return_value() {
assert_eq!(
Container::new().run(|| {
// Need something larger than /proc/sys/fs/pipe-max-size, which
// is typically 1MB.
vec![42; 10 * 1024 * 1024 /* 10 MB */]
}),
Ok(vec![42; 10 * 1024 * 1024])
);
}
#[test]
pub fn bind_to_low_port() {
use std::net::Ipv4Addr;
use std::net::SocketAddrV4;
use std::net::TcpListener;
let addr = Container::new()
.map_root()
.local_networking_only()
.run(|| {
let listener = TcpListener::bind("127.0.0.1:80").unwrap();
listener.local_addr().unwrap()
})
.unwrap();
assert_eq!(
addr,
SocketAddrV4::new(Ipv4Addr::new(127, 0, 0, 1), 80).into()
);
}
#[test]
pub fn pin_affinity_to_all_cores() -> Result<(), Error> {
use raw_cpuid::CpuId;
use std::collections::HashMap;
let cpus = num_cpus::get();
println!("Total cpus {}", cpus);
// Map the apic_id to the number of times we observed it:
let mut results: HashMap<u8, usize> = HashMap::new();
for core in 0..cpus {
println!(" Launching guest with affinity set to {}", core);
let mut container = Container::new();
container.affinity(core);
let which_core = container
.run(|| {
let cpuid = CpuId::new();
cpuid
.get_feature_info()
.expect("cpuid failed")
.initial_local_apic_id()
})
.unwrap();
println!(" Guest sees it is on APIC id {}", which_core);
*results.entry(which_core).or_default() += 1;
}
println!("Final table size {:?}", results.len());
assert_eq!(results.values().fold(0, |n, v| std::cmp::max(n, *v)), 1);
Ok(())
}
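// NOTE: Illustrative addition, not part of the original commit. A sketch of
// a test for the builder-style setters and their getters. It only inspects
// the builder and never spawns a process, so no extra privileges are assumed.
#[test]
fn hostname_and_domainname_getters() {
    use std::ffi::OsStr;

    let mut container = Container::new();
    container.hostname("foobar.local").domainname("foobar");
    assert_eq!(container.get_hostname(), Some(OsStr::new("foobar.local")));
    assert_eq!(container.get_domainname(), Some(OsStr::new("foobar")));
}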
}

108
reverie-process/src/env.rs Normal file
View file

@ -0,0 +1,108 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use std::borrow::Cow;
use std::collections::BTreeMap;
use std::ffi::{CString, OsStr, OsString};
use super::util::CStringArray;
/// A mapping of environment variables.
#[derive(Default, Clone, Debug)]
pub struct Env {
clear: bool,
vars: BTreeMap<OsString, Option<OsString>>,
}
impl Env {
/// Clear out all environment variables, including the ones inherited from
/// the parent process. Any variables set after this are completely new
/// variables.
pub fn clear(&mut self) {
self.clear = true;
self.vars.clear();
}
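/// Returns `true` if [`Env::clear`] has been called, in which case no
/// variables are inherited from the parent process.
pub fn is_cleared(&self) -> bool {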
self.clear
}
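/// Sets the variable `key` to `value`. In [`Env::capture`], this overrides
/// any inherited variable of the same name.
pub fn set(&mut self, key: &OsStr, value: &OsStr) {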
self.vars.insert(key.to_owned(), Some(value.to_owned()));
}
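/// Gets the value of a variable that was explicitly set with [`Env::set`].
/// Inherited variables are not consulted. Returns `None` if the variable was
/// never set or has been removed.
pub fn get<K: AsRef<OsStr>>(&self, key: K) -> Option<&OsStr> {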
self.vars
.get(key.as_ref())
.and_then(|v| v.as_ref().map(|v| v.as_os_str()))
}
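/// Looks up `key` in the current process environment first (unless the
/// environment was cleared) and then falls back to the value recorded with
/// [`Env::set`].
///
/// NOTE: This differs from [`Env::capture`], where recorded changes take
/// precedence over inherited values.
pub fn get_captured<K: AsRef<OsStr>>(&self, key: K) -> Option<Cow<OsStr>> {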
let key = key.as_ref();
if !self.clear {
if let Some(var) = std::env::var_os(key) {
return Some(Cow::Owned(var));
}
}
self.get(key).map(Cow::Borrowed)
}
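/// Removes a variable. If the environment has not been cleared, the removal
/// is recorded so that an inherited variable of the same name is dropped by
/// [`Env::capture`].
pub fn remove(&mut self, key: &OsStr) {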
if self.clear {
self.vars.remove(key);
} else {
self.vars.insert(key.to_owned(), None);
}
}
/// Capture the current environment and merge it with the changes we've
/// applied.
pub fn capture(&self) -> BTreeMap<OsString, OsString> {
let mut env = if self.clear {
BTreeMap::new()
} else {
// Capture from the current environment.
std::env::vars_os().collect()
};
for (k, v) in &self.vars {
if let Some(ref v) = v {
env.insert(k.clone(), v.clone());
} else {
env.remove(k);
}
}
env
}
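/// Captures the environment (see [`Env::capture`]) and converts it into an
/// array of `KEY=VALUE` C strings.
///
/// # Panics
///
/// Panics if a key or value contains an interior NUL byte.
pub fn array(&self) -> CStringArray {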
use std::os::unix::ffi::OsStringExt;
let env = self.capture();
let mut result = CStringArray::with_capacity(env.len());
for (mut k, v) in env {
// Reserve additional space for '=' and null terminator
k.reserve_exact(v.len() + 2);
k.push("=");
k.push(&v);
// Add the new entry into the array
result.push(CString::new(k.into_vec()).unwrap());
}
result
}
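/// Iterates over the recorded changes in key order. A value of `None` marks
/// a variable that was removed.
pub fn iter(&self) -> impl Iterator<Item = (&OsStr, Option<&OsStr>)> {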
self.vars.iter().map(|(k, v)| (k.as_ref(), v.as_deref()))
}
}
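// NOTE: Illustrative addition, not part of the original commit. A sketch of
// unit tests for `Env`. The environment is cleared first so that the
// assertions do not depend on the variables of the process running the test.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn set_remove_and_capture() {
        let mut env = Env::default();
        env.clear();
        assert!(env.is_cleared());

        env.set(OsStr::new("FOO"), OsStr::new("bar"));
        assert_eq!(env.get("FOO"), Some(OsStr::new("bar")));
        assert_eq!(env.capture().len(), 1);

        // After `clear`, a removal simply forgets the variable.
        env.remove(OsStr::new("FOO"));
        assert_eq!(env.get("FOO"), None);
        assert!(env.capture().is_empty());
    }
}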

View file

@ -0,0 +1,189 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use core::fmt;
use serde::{Deserialize, Serialize};
use syscalls::Errno;
/// Context associated with [`Error`]. Useful for knowing which particular part
/// of [`super::Command::spawn`] failed.
#[derive(Debug, Copy, Clone, Eq, PartialEq, Serialize, Deserialize)]
#[repr(u32)]
pub enum Context {
/// No context provided.
Unknown,
/// Setting CPU affinity failed.
Affinity,
/// The clone syscall failed.
Clone,
/// Setting up the tty failed.
Tty,
/// Setting up stdio failed.
Stdio,
/// Resetting signals failed.
ResetSignals,
/// Changing `/proc/{pid}/uid_map` failed.
MapUid,
/// Changing `/proc/{pid}/setgroups` or `/proc/{pid}/gid_map` failed.
MapGid,
/// Setting the hostname failed.
Hostname,
/// Setting the domainname failed.
Domainname,
/// Chroot failed.
Chroot,
/// Chdir failed.
Chdir,
/// Mounting failed.
Mount,
/// Network configuration failed.
Network,
/// The pre_exec callback(s) failed.
PreExec,
/// Setting the seccomp filter failed.
Seccomp,
/// Exec failed.
Exec,
}
impl Context {
/// Returns a string representation of the context.
pub fn as_str(&self) -> &'static str {
match self {
Self::Unknown => "Unknown failure",
Self::Affinity => "setting cpu affinity failed",
Self::Clone => "clone failed",
Self::Tty => "Setting the controlling tty failed",
Self::Stdio => "Setting up stdio file descriptors failed",
Self::ResetSignals => "Resetting signal handlers failed",
Self::MapUid => "Setting UID map failed",
Self::MapGid => "Setting GID map failed",
Self::Hostname => "Setting hostname failed",
Self::Domainname => "Setting domainname failed",
Self::Chroot => "chroot failed",
Self::Chdir => "chdir failed",
Self::Mount => "mount failed",
Self::Network => "network configuration failed",
Self::PreExec => "pre_exec callback(s) failed",
Self::Seccomp => "failed to install seccomp filter",
Self::Exec => "execvp failed",
}
}
}
impl fmt::Display for Context {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
fmt::Write::write_str(f, self.as_str())
}
}
/// An error from spawning a process. This is a thin wrapper around
/// [`crate::Errno`], but with more context about what went wrong.
#[derive(Debug, Copy, Clone, Eq, PartialEq, Serialize, Deserialize)]
pub struct Error {
errno: Errno,
context: Context,
}
impl Error {
/// Creates a new `Error`.
pub fn new(errno: Errno, context: Context) -> Self {
Self { errno, context }
}
/// Converts a value `S` into an `Error`. Useful for turning `libc` function
/// return types into a `Result`.
pub fn result<S>(value: S, context: Context) -> Result<S, Self>
where
S: syscalls::ErrnoSentinel + PartialEq<S>,
{
Errno::result(value).map_err(|err| Self::new(err, context))
}
/// Gets the errno.
pub fn errno(&self) -> Errno {
self.errno
}
/// Gets the error context.
pub fn context(&self) -> Context {
self.context
}
}
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
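// Use `write!` rather than `writeln!` so that the message does not
// carry a trailing newline when embedded in other messages.
write!(f, "{}: {}", self.context, self.errno)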
}
}
impl std::error::Error for Error {}
impl From<Errno> for Error {
fn from(err: Errno) -> Self {
Self::new(err, Context::Unknown)
}
}
impl From<Error> for Errno {
fn from(err: Error) -> Errno {
err.errno
}
}
impl From<Error> for std::io::Error {
fn from(err: Error) -> Self {
std::io::Error::from(err.errno)
}
}
impl From<[u8; 8]> for Error {
/// Deserializes an `Error` from bytes. Useful for receiving the error
/// through a pipe from the child process.
fn from(bytes: [u8; 8]) -> Self {
debug_assert_eq!(core::mem::size_of::<Self>(), 8);
unsafe { core::mem::transmute(bytes) }
}
}
impl From<Error> for [u8; 8] {
/// Serializes an `Error` into bytes. Useful for sending the error through a
/// pipe to the parent process.
fn from(error: Error) -> Self {
// `Self` is `[u8; 8]` here, so check the size of `Error` explicitly.
debug_assert_eq!(core::mem::size_of::<Error>(), 8);
unsafe { core::mem::transmute(error) }
}
}
pub(super) trait AddContext<T> {
fn context(self, context: Context) -> Result<T, Error>;
}
impl<T> AddContext<T> for Result<T, Errno> {
fn context(self, context: Context) -> Result<T, Error> {
self.map_err(move |errno| Error::new(errno, context))
}
}
impl<T> AddContext<T> for Result<T, nix::errno::Errno> {
fn context(self, context: Context) -> Result<T, Error> {
self.map_err(move |errno| Error::new(Errno::new(errno as i32), context))
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn to_bytes() {
let bytes: [u8; 8] = Error::new(Errno::ENOENT, Context::Exec).into();
assert_eq!(Error::from(bytes), Error::new(Errno::ENOENT, Context::Exec));
}
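// NOTE: Illustrative addition, not part of the original commit. A sketch of
// a test for the accessors and the `Errno` conversions defined above.
#[test]
fn context_is_preserved() {
    let err = Error::new(Errno::EPERM, Context::Mount);
    assert_eq!(err.errno(), Errno::EPERM);
    assert_eq!(err.context(), Context::Mount);

    // Converting from a bare `Errno` carries no context.
    assert_eq!(Error::from(Errno::EPERM).context(), Context::Unknown);

    // Converting back to an `Errno` drops the context.
    assert_eq!(Errno::from(err), Errno::EPERM);
}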
}

View file

@ -0,0 +1,331 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use std::os::unix::process::ExitStatusExt;
use nix::sys::signal::{self, SigHandler, SigSet, SigmaskHow, Signal};
/// Describes the result of a process after it has exited.
///
/// This is similar to `std::process::ExitStatus`, but is easier to match
/// against and provides additional functionality like `raise_or_exit` that
/// helps with propagating an exit status.
#[derive(Debug, Clone, Copy, Eq, PartialEq, Ord, PartialOrd)]
pub enum ExitStatus {
/// Program exited with an exit code.
Exited(i32),
/// Program killed by signal, with or without a coredump.
Signaled(Signal, bool),
}
impl ExitStatus {
/// A successful exit status.
pub const SUCCESS: Self = ExitStatus::Exited(0);
/// Constructs an `ExitStatus` from a raw wait status, such as the one
/// returned by `waitpid`. Note that this is not the same as an exit code.
pub fn from_raw(code: i32) -> Self {
if libc::WIFEXITED(code) {
ExitStatus::Exited(libc::WEXITSTATUS(code))
} else {
ExitStatus::Signaled(
Signal::try_from(libc::WTERMSIG(code)).unwrap(),
libc::WCOREDUMP(code),
)
}
}
/// Converts the exit status into a raw wait status, the inverse of
/// [`ExitStatus::from_raw`].
pub fn into_raw(self) -> i32 {
match self {
ExitStatus::Exited(code) => code << 8,
ExitStatus::Signaled(sig, coredump) => {
if coredump {
(sig as i32 | 0x80) & 0xff
} else {
sig as i32 & 0x7f
}
}
}
}
/// If the process was terminated by a signal, returns that signal.
pub fn signal(&self) -> Option<i32> {
match self {
ExitStatus::Exited(_) => None,
ExitStatus::Signaled(sig, _) => Some(*sig as i32 & 0x7f),
}
}
/// Was termination successful? Signal termination is not considered a
/// success, and success is defined as a zero exit status.
pub fn success(&self) -> bool {
self == &ExitStatus::Exited(0)
}
/// Returns the exit code of the process, if any. If the process was
/// terminated by a signal, this will return `None`.
pub fn code(&self) -> Option<i32> {
if let ExitStatus::Exited(code) = *self {
Some(code)
} else {
None
}
}
/// Propagate the exit status such that the current process exits in the same
/// way that the child process exited.
pub fn raise_or_exit(self) -> ! {
match self {
ExitStatus::Signaled(signal, core_dump) => {
if core_dump {
// Prevent the current process from producing a core dump as
// well when the signal is propagated.
let limit = libc::rlimit {
rlim_cur: 0,
rlim_max: 0,
};
unsafe {
libc::setrlimit(libc::RLIMIT_CORE, &limit)
};
}
// Raise the same signal, which may or may not be fatal.
let _ = unsafe { signal::signal(signal, SigHandler::SigDfl) };
let _ = signal::raise(signal);
// Unblock the signal.
let mut mask = SigSet::empty();
mask.add(signal);
let _ = signal::sigprocmask(SigmaskHow::SIG_UNBLOCK, Some(&mask), None);
// In case the signal is not fatal:
std::process::exit(signal as i32 + 128);
}
ExitStatus::Exited(code) => std::process::exit(code),
}
}
}
impl From<ExitStatus> for std::process::ExitStatus {
fn from(status: ExitStatus) -> Self {
Self::from_raw(status.into_raw())
}
}
impl From<std::process::ExitStatus> for ExitStatus {
fn from(status: std::process::ExitStatus) -> Self {
if let Some(sig) = status.signal() {
// Preserve whether a core dump was produced instead of assuming one.
ExitStatus::Signaled(Signal::try_from(sig).unwrap(), status.core_dumped())
} else {
ExitStatus::Exited(status.code().unwrap_or(255))
}
}
}
impl serde::Serialize for ExitStatus {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::ser::Serializer,
{
serializer.serialize_i32(self.into_raw())
}
}
impl<'de> serde::Deserialize<'de> for ExitStatus {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: serde::de::Deserializer<'de>,
{
let value = i32::deserialize(deserializer)?;
Ok(ExitStatus::from_raw(value))
}
}
#[cfg(all(test, not(sanitized)))]
mod tests_non_sanitized {
use super::*;
use nix::{
sys::{
signal::{self, Signal},
wait::{waitpid, WaitStatus},
},
unistd::{fork, ForkResult},
};
// Runs a closure in a forked process and reports the exit status.
fn run_forked<F>(f: F) -> nix::Result<ExitStatus>
where
F: FnOnce() -> nix::Result<()>,
{
match unsafe { fork() }? {
ForkResult::Parent { child, .. } => {
// Simply wait for the child to exit.
match waitpid(child, None)? {
WaitStatus::Exited(_, code) => Ok(ExitStatus::Exited(code)),
WaitStatus::Signaled(_, sig, coredump) => {
Ok(ExitStatus::Signaled(sig, coredump))
}
wait_status => unreachable!("Got unexpected wait status: {:?}", wait_status),
}
}
ForkResult::Child => {
// Suppress core dumps for testing purposes.
let limit = libc::rlimit {
rlim_cur: 0,
rlim_max: 0,
};
unsafe {
// restore some sighandlers to default
for &sig in &[libc::SIGALRM, libc::SIGINT, libc::SIGVTALRM] {
libc::signal(sig, libc::SIG_DFL);
}
// disable coredump
libc::setrlimit(libc::RLIMIT_CORE, &limit)
};
// Run the child.
let code = match f() {
Ok(()) => 0,
Err(err) => {
eprintln!("{}", err);
1
}
};
// The closure should have called `exit` by this point, but just
// in case it didn't, call it ourselves.
//
// Note: We also can't use the normal exit function here because we
// don't want to call atexit handlers since `execve` was never
// called.
unsafe {
::libc::_exit(code)
};
}
}
}
#[test]
fn normal_exit() {
assert_eq!(
run_forked(|| {
unsafe { libc::_exit(0) }
}),
Ok(ExitStatus::Exited(0))
);
assert_eq!(
run_forked(|| {
unsafe { libc::_exit(42) }
}),
Ok(ExitStatus::Exited(42))
);
// Thread exit
assert_eq!(
run_forked(|| {
unsafe {
libc::syscall(libc::SYS_exit, 42)
};
unreachable!();
}),
Ok(ExitStatus::Exited(42))
);
// exit_group. Should be identical to `libc::_exit`.
assert_eq!(
run_forked(|| {
unsafe {
libc::syscall(libc::SYS_exit_group, 42)
};
unreachable!();
}),
Ok(ExitStatus::Exited(42))
);
}
#[test]
fn exit_by_signal() {
assert_eq!(
run_forked(|| {
signal::raise(Signal::SIGALRM)?;
unreachable!();
}),
Ok(ExitStatus::Signaled(Signal::SIGALRM, false))
);
assert_eq!(
run_forked(|| {
signal::raise(Signal::SIGILL)?;
unreachable!();
}),
Ok(ExitStatus::Signaled(Signal::SIGILL, true))
);
}
#[test]
fn propagate_exit() {
// NOTE: These tests fail under a sanitized build. ASAN leak detection
// must be disabled for this to run correctly. To disable ASAN leak
// detection, set the `ASAN_OPTIONS=detect_leaks=0` environment variable
// *before* the test starts up. (This is currently done in the TARGETS
// file.) Alternatively, we *could* bypass the atexit handler that ASAN
// sets up by calling `libc::_exit`, but that may have unintended
// consequences for real code.
assert_eq!(
run_forked(|| { ExitStatus::Exited(0).raise_or_exit() }),
Ok(ExitStatus::Exited(0))
);
assert_eq!(
run_forked(|| { ExitStatus::Exited(42).raise_or_exit() }),
Ok(ExitStatus::Exited(42))
);
}
#[test]
fn propagate_signal() {
assert_eq!(
run_forked(|| { ExitStatus::Signaled(Signal::SIGILL, true).raise_or_exit() }),
Ok(ExitStatus::Signaled(Signal::SIGILL, true))
);
assert_eq!(
run_forked(|| { ExitStatus::Signaled(Signal::SIGALRM, false).raise_or_exit() }),
Ok(ExitStatus::Signaled(Signal::SIGALRM, false))
);
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn exit_code_into_raw() {
assert_eq!(ExitStatus::Exited(1).into_raw(), 0x1 << 8);
assert_eq!(
ExitStatus::Signaled(Signal::SIGINT, false).into_raw(),
Signal::SIGINT as i32
);
assert_eq!(
ExitStatus::Signaled(Signal::SIGILL, true).into_raw(),
0x80 | Signal::SIGILL as i32
);
assert_ne!(
ExitStatus::Exited(2).into_raw(),
ExitStatus::Signaled(Signal::SIGINT, false).into_raw()
);
}
#[test]
fn exit_status_from_raw() {
assert_eq!(ExitStatus::from_raw(0x100).code(), Some(1));
assert_eq!(ExitStatus::from_raw(0x100).signal(), None);
assert_eq!(ExitStatus::from_raw(0x84).code(), None);
assert_eq!(ExitStatus::from_raw(0x84).signal(), Some(4));
}
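// NOTE: Illustrative addition, not part of the original commit. A sketch of
// a round-trip check: `into_raw` places an exit code in bits 8..16 and a
// signal number in the low 7 bits (with 0x80 as the core dump flag), and
// `from_raw` is expected to decode exactly that layout.
#[test]
fn raw_round_trip() {
    for status in [
        ExitStatus::Exited(0),
        ExitStatus::Exited(42),
        ExitStatus::Signaled(Signal::SIGINT, false),
        ExitStatus::Signaled(Signal::SIGILL, true),
    ] {
        assert_eq!(ExitStatus::from_raw(status.into_raw()), status);
    }
}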
}

531
reverie-process/src/fd.rs Normal file
View file

@ -0,0 +1,531 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use super::util;
use core::pin::Pin;
use core::task::{Context, Poll};
use std::ffi::{CStr, CString};
use std::io::{self, Read, Write};
use std::os::unix::io::{AsRawFd, FromRawFd, IntoRawFd, RawFd};
use std::path::Path;
use syscalls::Errno;
use tokio::io::unix::AsyncFd as TokioAsyncFd;
use tokio::io::{AsyncRead, AsyncWrite, Interest, ReadBuf};
#[derive(Debug)]
// From `std/src/sys/unix/fd.rs`. Mark `-1` as an invalid file descriptor so
// that it can be reused as the `None` niche in `Option<Fd>`.
#[rustc_layout_scalar_valid_range_start(0)]
#[rustc_layout_scalar_valid_range_end(0xFF_FF_FF_FE)]
pub struct Fd(i32);
/// An asynchronous file descriptor. The file descriptor is guaranteed to be in
/// non-blocking mode and implements `AsyncRead` and `AsyncWrite`.
#[derive(Debug)]
pub struct AsyncFd(TokioAsyncFd<Fd>);
impl Fd {
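/// Takes ownership of a raw file descriptor. The descriptor is closed when
/// the returned `Fd` is dropped.
///
/// # Panics
///
/// Panics if `fd` is `-1`.
pub fn new(fd: i32) -> Self {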
assert_ne!(fd, -1);
unsafe { Self(fd) }
}
#[allow(dead_code)]
pub fn open<P: AsRef<Path>>(path: P, flags: i32) -> Result<Self, Errno> {
let path = util::to_cstring(path.as_ref());
Self::open_c(path.as_ptr(), flags)
}
/// Opens a file from a NUL terminated string. This function does not
/// allocate.
pub fn open_c(path: *const libc::c_char, flags: i32) -> Result<Self, Errno> {
let fd = Errno::result(unsafe { libc::open(path, flags) })?;
Ok(unsafe { Self(fd) })
}
/// Creates a file from a NUL terminated string. This function does not allocate.
pub fn create_c(
path: *const libc::c_char,
flags: i32,
mode: libc::mode_t,
) -> Result<Self, Errno> {
let fd = Errno::result(unsafe { libc::open(path, flags | libc::O_CREAT, mode) })?;
Ok(unsafe { Self(fd) })
}
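/// Opens `/dev/null` for reading if `readable` is `true`, otherwise for
/// writing. This function does not allocate.
pub fn null(readable: bool) -> Result<Self, Errno> {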
let path = unsafe { CStr::from_bytes_with_nul_unchecked(b"/dev/null\0") };
Self::open_c(
path.as_ptr(),
if readable {
libc::O_RDONLY
} else {
libc::O_WRONLY
},
)
}
/// Creates an endpoint for communications and returns a file descriptor that
/// refers to that endpoint.
pub fn socket(domain: i32, ty: i32, protocol: i32) -> Result<Self, Errno> {
Errno::result(unsafe { libc::socket(domain, ty, protocol) }).map(Self::new)
}
fn set_nonblocking(&self) -> Result<(), Errno> {
let fd = self.as_raw_fd();
let flags = Errno::result(unsafe { libc::fcntl(fd, libc::F_GETFL) })?;
Errno::result(unsafe { libc::fcntl(fd, libc::F_SETFL, flags | libc::O_NONBLOCK) })?;
Ok(())
}
/// Returns true if the file descriptor is nonblocking.
#[allow(unused)]
pub fn is_nonblocking(&self) -> Result<bool, Errno> {
let fd = self.as_raw_fd();
let flags = Errno::result(unsafe { libc::fcntl(fd, libc::F_GETFL) })?;
Ok(flags & libc::O_NONBLOCK == libc::O_NONBLOCK)
}
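/// Duplicates the file descriptor, returning a new `Fd` that refers to the
/// same open file description.
pub fn dup(&self) -> Result<Fd, Errno> {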
let fd = Errno::result(unsafe { libc::dup(self.0) })?;
Ok(unsafe { Fd(fd) })
}
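/// Duplicates the file descriptor onto `newfd`, closing `newfd` first if it
/// was open. The returned `Fd` owns `newfd`; call [`Fd::leave_open`] on it to
/// keep `newfd` from being closed on drop.
pub fn dup2(&self, newfd: RawFd) -> Result<Fd, Errno> {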
let fd = Errno::result(unsafe { libc::dup2(self.0, newfd) })?;
Ok(unsafe { Fd(fd) })
}
#[allow(unused)]
pub fn close(self) -> Result<(), Errno> {
let fd = self.0;
core::mem::forget(self);
Errno::result(unsafe { libc::close(fd) })?;
Ok(())
}
/// Discards the file descriptor without closing it.
pub fn leave_open(self) {
core::mem::forget(self);
}
}
impl IntoRawFd for Fd {
fn into_raw_fd(self) -> RawFd {
let fd = self.as_raw_fd();
core::mem::forget(self);
fd
}
}
impl Drop for Fd {
fn drop(&mut self) {
let _ = unsafe { libc::close(self.0) };
}
}
impl Read for Fd {
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
let res = Errno::result(unsafe {
libc::read(
self.0,
buf.as_mut_ptr() as *mut libc::c_void,
buf.len() as libc::size_t,
)
})?;
Ok(res as usize)
}
}
impl Write for Fd {
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
let res = Errno::result(unsafe {
libc::write(
self.0,
buf.as_ptr() as *const libc::c_void,
buf.len() as libc::size_t,
)
})?;
Ok(res as usize)
}
fn flush(&mut self) -> io::Result<()> {
Ok(())
}
}
impl AsRawFd for Fd {
fn as_raw_fd(&self) -> RawFd {
self.0.as_raw_fd()
}
}
impl FromRawFd for Fd {
unsafe fn from_raw_fd(fd: i32) -> Self {
Self::new(fd)
}
}
impl From<Fd> for std::fs::File {
fn from(fd: Fd) -> Self {
unsafe { std::fs::File::from_raw_fd(fd.into_raw_fd()) }
}
}
impl AsyncFd {
pub fn new(fd: Fd) -> Result<Self, Errno> {
fd.set_nonblocking()?;
Ok(Self(
TokioAsyncFd::with_interest(fd, Interest::READABLE | Interest::WRITABLE).unwrap(),
))
}
pub fn readable(fd: Fd) -> Result<Self, Errno> {
fd.set_nonblocking()?;
Ok(Self(
TokioAsyncFd::with_interest(fd, Interest::READABLE).unwrap(),
))
}
pub fn writable(fd: Fd) -> Result<Self, Errno> {
fd.set_nonblocking()?;
Ok(Self(
TokioAsyncFd::with_interest(fd, Interest::WRITABLE).unwrap(),
))
}
}
impl AsRawFd for AsyncFd {
fn as_raw_fd(&self) -> RawFd {
self.0.as_raw_fd()
}
}
impl AsyncRead for AsyncFd {
fn poll_read(
mut self: Pin<&mut Self>,
cx: &mut Context<'_>,
buf: &mut ReadBuf<'_>,
) -> Poll<io::Result<()>> {
loop {
let mut guard = futures::ready!(self.0.poll_read_ready_mut(cx))?;
match guard.try_io(|inner| {
let n = inner.get_mut().read(buf.initialize_unfilled())?;
buf.advance(n);
Ok(())
}) {
Ok(result) => return Poll::Ready(result),
Err(_would_block) => continue,
}
}
}
}
impl AsyncWrite for AsyncFd {
fn poll_write(
mut self: Pin<&mut Self>,
cx: &mut Context<'_>,
buf: &[u8],
) -> Poll<io::Result<usize>> {
loop {
let mut guard = futures::ready!(self.0.poll_write_ready_mut(cx))?;
match guard.try_io(|inner| inner.get_mut().write(buf)) {
Ok(result) => return Poll::Ready(result),
Err(_would_block) => continue,
}
}
}
fn poll_flush(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<io::Result<()>> {
Poll::Ready(Ok(()))
}
fn poll_shutdown(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<io::Result<()>> {
Poll::Ready(Ok(()))
}
}
/// Creates a unidirectional pipe. The readable end is the first item and the
/// writable end is the second item.
pub fn pipe() -> Result<(Fd, Fd), Errno> {
let mut fds = [0; 2];
// We use O_CLOEXEC because we don't want the pipe file descriptor to be
// inherited by child processes directly. Instead, we use `dup2` to assign
// it to one of the stdio file descriptors. Then, the duplicated file
// descriptor won't be closed upon exec.
Errno::result(unsafe { libc::pipe2(fds.as_mut_ptr(), libc::O_CLOEXEC) })?;
Ok((unsafe { Fd(fds[0]) }, unsafe { Fd(fds[1]) }))
}
/// Writes bytes to a file. The file path must be NUL terminated.
pub fn write_bytes(path: &'static [u8], bytes: &[u8]) -> Result<(), Errno> {
let path = unsafe { CStr::from_bytes_with_nul_unchecked(path) };
Fd::open_c(path.as_ptr(), libc::O_WRONLY)?
.write_all(bytes)
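// `write_all` can fail without an OS error code (for example, when a write
// returns zero bytes), so fall back to EIO instead of panicking.
.map_err(|err| Errno::new(err.raw_os_error().unwrap_or(libc::EIO)))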
}
/// Creates a file if it does not exist.
pub fn touch(path: *const libc::c_char, mode: libc::mode_t) -> Result<(), Errno> {
Fd::create_c(path, libc::O_CLOEXEC, mode).map(drop)
}
pub fn lstat(path: *const libc::c_char) -> Result<libc::stat64, Errno> {
let mut buf: libc::stat64 = unsafe { core::mem::zeroed() };
Errno::result(unsafe { libc::lstat64(path, &mut buf) })?;
Ok(buf)
}
#[derive(Copy, Clone, Eq, PartialEq)]
pub struct FileType(libc::mode_t);
impl FileType {
pub fn new(path: *const libc::c_char) -> Result<Self, Errno> {
Ok(Self::from(lstat(path)?))
}
pub fn is_dir(&self) -> bool {
self.0 & libc::S_IFMT == libc::S_IFDIR
}
#[allow(unused)]
pub fn is_file(&self) -> bool {
self.0 & libc::S_IFMT == libc::S_IFREG
}
}
impl From<libc::stat64> for FileType {
fn from(stat: libc::stat64) -> Self {
Self(stat.st_mode)
}
}
/// Returns true if `path` is a directory. Returns `false` in all other cases.
///
/// NOTE: The `path` may exist and may be a directory, but this will still return
/// false if there is a permissions error. Use `FileType` to distinguish these
/// cases.
pub fn is_dir(path: *const libc::c_char) -> bool {
match FileType::new(path) {
Ok(ft) => ft.is_dir(),
Err(_) => false,
}
}
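/*
The `FileType` checks above mask `st_mode` with `S_IFMT` and compare against a
single type constant. The sketch below shows the same idea using only the
standard library. It is an illustration, not part of this crate: the function
names are hypothetical, and the octal constants are the Linux values written
out by hand instead of taken from `libc`.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

// File-type bits of `st_mode` on Linux, written out as octal literals.
const S_IFMT: u32 = 0o170000;
const S_IFDIR: u32 = 0o040000;

/// Returns the file-type bits of `path` without following symlinks.
fn file_type_bits(path: &Path) -> io::Result<u32> {
    Ok(fs::symlink_metadata(path)?.mode() & S_IFMT)
}

/// Mirrors `is_dir`: every error is reported as `false`.
fn is_dir_sketch(path: &Path) -> bool {
    matches!(file_type_bits(path), Ok(bits) if bits == S_IFDIR)
}
```

As with `is_dir`, a missing path and a permissions error are indistinguishable
here; call `file_type_bits` directly when the difference matters.
*/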
fn cstring_as_slice(s: &mut CString) -> &mut [libc::c_char] {
let bytes = s.as_bytes_with_nul();
unsafe {
// This is safe because we are already provided a mutable `CString` and
// we don't alias the two mutable references.
core::slice::from_raw_parts_mut(bytes.as_ptr() as *mut libc::c_char, bytes.len())
}
}
/// Creates every path component in `path` without allocating. This is done by
/// replacing each `/` with a NUL terminator as needed (and then changing the
/// `\0` back to `/` afterwards).
pub fn create_dir_all(path: &mut CString, mode: libc::mode_t) -> Result<(), Errno> {
create_dir_all_(cstring_as_slice(path), mode)
}
/// Helper function. The last character in the path is always `\0`.
fn create_dir_all_(path: &mut [libc::c_char], mode: libc::mode_t) -> Result<(), Errno> {
if path.len() == 1 {
return Ok(());
}
// Try creating this directory
match Errno::result(unsafe { libc::mkdir(path.as_ptr(), mode) }) {
Ok(_) => return Ok(()),
Err(Errno::ENOENT) => {}
Err(_) if is_dir(path.as_ptr()) => return Ok(()),
Err(e) => return Err(e),
}
// If it doesn't exist, try creating the parent directory.
with_parent(path, |parent| {
match parent {
Some(p) => create_dir_all_(p, mode),
None => {
// Got all the way to the root without successfully creating any
// child directories. Most likely a permissions error.
Err(Errno::EPERM)
}
}
})?;
// Finally, try creating the directory again after the parent directories
// now exist.
match Errno::result(unsafe { libc::mkdir(path.as_ptr(), mode) }) {
Ok(_) => Ok(()),
Err(_) if is_dir(path.as_ptr()) => Ok(()),
Err(e) => Err(e),
}
}
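/*
The control flow of `create_dir_all_` (try, recurse into the parent on ENOENT,
then retry) can be sketched with the standard library. This version is only an
illustration under stated assumptions: it allocates freely, uses `std::fs`
instead of raw `mkdir`, and the name `create_dir_all_sketch` is hypothetical.

```rust
use std::fs;
use std::io::{self, ErrorKind};
use std::path::Path;

/// Creates `path` and any missing ancestors.
fn create_dir_all_sketch(path: &Path) -> io::Result<()> {
    // First attempt: succeeds immediately when the parent already exists.
    match fs::create_dir(path) {
        Ok(()) => return Ok(()),
        Err(e) if e.kind() == ErrorKind::NotFound => {}
        // Any other error is acceptable as long as the directory is there.
        Err(_) if path.is_dir() => return Ok(()),
        Err(e) => return Err(e),
    }
    // The parent is missing, so create it first.
    match path.parent() {
        Some(parent) => create_dir_all_sketch(parent)?,
        None => return Err(io::Error::from(ErrorKind::PermissionDenied)),
    }
    // Retry now that the ancestors exist.
    match fs::create_dir(path) {
        Ok(()) => Ok(()),
        Err(_) if path.is_dir() => Ok(()),
        Err(e) => Err(e),
    }
}
```

Calling it twice on the same path succeeds both times, because an existing
directory is treated as success.
*/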
/// Creates an empty file at `path`, along with any missing parent
/// directories, without allocating.
pub fn touch_path(
path: &mut CString,
file_mode: libc::mode_t,
dir_mode: libc::mode_t,
) -> Result<(), Errno> {
touch_path_(cstring_as_slice(path), file_mode, dir_mode)
}
/// Helper function. The last character in the path is always `\0`.
fn touch_path_(
path: &mut [libc::c_char],
file_mode: libc::mode_t,
dir_mode: libc::mode_t,
) -> Result<(), Errno> {
// Try to create the file. This may fail if the parent directories do not exist.
match touch(path.as_ptr(), file_mode) {
Ok(_) => return Ok(()),
Err(Errno::ENOENT) => {}
Err(e) => return Err(e),
}
// Got ENOENT. Try to create the parent directories.
with_parent(path, |parent| match parent {
Some(p) => create_dir_all_(p, dir_mode),
None => Err(Errno::ENOENT),
})?;
// Try creating the file again after the parent directories now exist.
touch(path.as_ptr(), file_mode)
}
/// Helper function for chopping off the last path component, leaving only the
/// parent directory. To do this without allocating, the last path separator is
/// replaced with NUL before calling the closure. After the closure is done, the
/// NUL byte is replaced by the path separator again. Thus, the path is only
/// mutated for the duration of the closure.
fn with_parent<F, T>(path: &mut [libc::c_char], mut f: F) -> T
where
F: FnMut(Option<&mut [libc::c_char]>) -> T,
{
// Find the index of one past the last path separator.
if let Some(parent_index) = path
.iter()
.rev()
.position(|c| *c == b'/' as libc::c_char)
.map(|i| path.len() - i)
{
// NB: the index is guaranteed to be >0.
path[parent_index - 1] = 0;
let result = f(Some(&mut path[..parent_index]));
// Restore the path to its former glory.
path[parent_index - 1] = b'/' as libc::c_char;
result
} else {
f(None)
}
}
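/*
The separator-swapping trick used by `with_parent` is easier to see on a plain
byte buffer. The following is a minimal sketch, assuming `u8` instead of
`libc::c_char`; `with_parent_bytes` and `parent_of` are hypothetical names
used only for illustration.

```rust
/// Calls `f` with the parent portion of a NUL-terminated path, or with `None`
/// if the path contains no `/`. The buffer is restored before returning.
fn with_parent_bytes<T>(path: &mut [u8], f: impl FnOnce(Option<&mut [u8]>) -> T) -> T {
    // Index of the last `/`, if any.
    match path.iter().rposition(|&c| c == b'/') {
        Some(sep) => {
            // Temporarily terminate the string at the separator.
            path[sep] = 0;
            let result = f(Some(&mut path[..=sep]));
            // Put the separator back.
            path[sep] = b'/';
            result
        }
        None => f(None),
    }
}

/// Returns the parent seen by the closure, without its trailing NUL.
fn parent_of(path: &mut [u8]) -> Option<Vec<u8>> {
    with_parent_bytes(path, |parent| parent.map(|p| p[..p.len() - 1].to_vec()))
}
```

For the buffer `/a/b/c\0`, the closure sees `/a/b\0`, and the buffer is back
to its original contents once `with_parent_bytes` returns.
*/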
#[cfg(test)]
mod tests {
use super::*;
use const_cstr::const_cstr;
use std::os::unix::ffi::OsStrExt;
#[test]
fn test_is_dir() {
assert!(is_dir(const_cstr!("/").as_ptr()));
assert!(is_dir(const_cstr!("/dev").as_ptr()));
assert!(!is_dir(const_cstr!("/dev/null").as_ptr()));
}
#[test]
fn test_file_type() {
assert!(FileType::new(const_cstr!("/").as_ptr()).unwrap().is_dir());
assert!(
FileType::new(const_cstr!("/dev").as_ptr())
.unwrap()
.is_dir()
);
assert!(
!FileType::new(const_cstr!("/dev/null").as_ptr())
.unwrap()
.is_file()
);
}
#[test]
fn test_create_dir_all() {
let tempdir = tempfile::TempDir::new().unwrap();
let mut path = CString::new(
tempdir
.path()
.join("some/path/to/a/dir")
.into_os_string()
.as_bytes(),
)
.unwrap();
let path2 = path.clone();
create_dir_all(&mut path, 0o777).unwrap();
assert_eq!(path, path2);
assert!(is_dir(path.as_ptr()));
}
#[test]
fn test_touch_path() {
let tempdir = tempfile::TempDir::new().unwrap();
let mut path = CString::new(
tempdir
.path()
.join("some/path/to/a/file")
.into_os_string()
.as_bytes(),
)
.unwrap();
let path2 = path.clone();
touch_path(&mut path, 0o666, 0o777).unwrap();
assert_eq!(path, path2);
assert!(FileType::new(path.as_ptr()).unwrap().is_file());
}
#[test]
fn test_nonblocking() -> Result<(), Errno> {
let (r, w) = pipe()?;
assert!(!r.is_nonblocking()?);
assert!(!w.is_nonblocking()?);
let f = w.dup()?;
assert!(!f.is_nonblocking()?);
w.set_nonblocking()?;
assert!(!r.is_nonblocking()?);
assert!(w.is_nonblocking()?);
assert!(f.is_nonblocking()?);
Ok(())
}
}


@ -0,0 +1,17 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use std::io::Write;
pub fn make_id_map(map: &[(libc::uid_t, libc::uid_t, u32)]) -> Vec<u8> {
let mut v = Vec::new();
for (inside, outside, count) in map {
writeln!(v, "{} {} {}", inside, outside, count).unwrap();
}
v
}
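/*
To make the output of `make_id_map` concrete, here is a small standalone
sketch of the same formatting. It assumes plain `u32` IDs instead of
`libc::uid_t`, and `id_map_bytes` is a hypothetical name. The three columns
appear to follow the `uid_map`/`gid_map` layout described in
user_namespaces(7): ID inside the namespace, ID outside, and range length.

```rust
use std::io::Write;

/// Renders `(inside, outside, count)` triples, one per line.
fn id_map_bytes(map: &[(u32, u32, u32)]) -> Vec<u8> {
    let mut out = Vec::new();
    for (inside, outside, count) in map {
        // Writing to a `Vec<u8>` cannot fail.
        writeln!(out, "{} {} {}", inside, outside, count).unwrap();
    }
    out
}
```

Mapping root inside the namespace to user 1000 outside, for a single ID,
produces the line `0 1000 1`.
*/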

610
reverie-process/src/lib.rs Normal file

@ -0,0 +1,610 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
//! A drop-in replacement for `std::process::Command` that provides the ability
//! to set up namespaces, a seccomp filter, and more.
#![deny(missing_docs)]
#![deny(rustdoc::broken_intra_doc_links)]
#![feature(internal_output_capture)]
#![feature(never_type)]
#![feature(rustc_attrs)]
mod builder;
mod child;
mod clone;
mod container;
mod env;
mod error;
mod exit_status;
mod fd;
mod id_map;
mod mount;
mod namespace;
mod net;
mod pid;
mod pty;
pub mod seccomp;
mod spawn;
mod stdio;
mod util;
pub use child::Child;
pub use child::Output;
pub use container::Container;
pub use container::RunError;
pub use error::Context;
pub use error::Error;
pub use exit_status::ExitStatus;
pub use mount::Bind;
pub use mount::Mount;
pub use mount::MountFlags;
pub use mount::MountParseError;
pub use namespace::Namespace;
pub use pid::Pid;
pub use pty::Pty;
pub use pty::PtyChild;
pub use stdio::ChildStderr;
pub use stdio::ChildStdin;
pub use stdio::ChildStdout;
pub use stdio::Stdio;
// Re-export Signal since it is used by `Child::signal`.
pub use nix::sys::signal::Signal;
use std::ffi::CString;
use syscalls::Errno;
/// A builder for spawning a process.
// See builder.rs for documentation of each field.
pub struct Command {
program: CString,
args: util::CStringArray,
pre_exec: Vec<Box<dyn FnMut() -> Result<(), Errno> + Send + Sync>>,
container: Container,
}
impl Command {
/// Converts [`std::process::Command`] into [`Command`]. Note that this is a
/// very basic and *lossy* conversion.
///
/// This only preserves the
/// - program path,
/// - arguments,
/// - environment variables,
/// - and working directory.
///
/// # Caveats
///
/// Since [`std::process::Command`] is rather opaque and doesn't provide
/// access to all fields, this will *not* preserve:
/// - stdio handles,
/// - `env_clear`,
/// - any `pre_exec` callbacks,
/// - `arg0` (if not the same as `program`),
/// - `uid`, `gid`, or `groups`.
pub fn from_std_lossy(cmd: &std::process::Command) -> Command {
let mut result = Command::new(cmd.get_program());
result.args(cmd.get_args());
for (key, value) in cmd.get_envs() {
match value {
Some(value) => result.env(key, value),
None => result.env_remove(key),
};
}
if let Some(dir) = cmd.get_current_dir() {
result.current_dir(dir);
}
result
}
/// This provides a *lossy* conversion to [`std::process::Command`]. The
/// features that are not supported by [`std::process::Command`] but *are*
/// supported by [`Command`] cannot be converted. For example, namespace and
/// mount configurations cannot be converted since they are not supported by
/// [`std::process::Command`].
pub fn into_std_lossy(self) -> std::process::Command {
let mut result = std::process::Command::new(self.get_program());
result.args(self.get_args());
if self.container.env.is_cleared() {
result.env_clear();
}
for (key, value) in self.get_envs() {
match value {
Some(value) => result.env(key, value),
None => result.env_remove(key),
};
}
if let Some(dir) = self.get_current_dir() {
result.current_dir(dir);
}
#[cfg(unix)]
{
use std::os::unix::process::CommandExt;
result.arg0(self.get_arg0());
for mut f in self.pre_exec {
unsafe {
result.pre_exec(move || f().map_err(Into::into));
}
}
}
result.stdin(self.container.stdin);
result.stdout(self.container.stdout);
result.stderr(self.container.stderr);
result
}
}
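/*
The lossy conversions above are limited by what `std::process::Command`
exposes. The sketch below collects everything reachable through its public
getters for program, arguments, and environment; `visible_parts` is a
hypothetical helper for illustration and is not part of this crate.

```rust
use std::process::Command;

/// Collects what `std::process::Command` exposes through its getters.
fn visible_parts(cmd: &Command) -> (String, Vec<String>, Vec<(String, Option<String>)>) {
    let program = cmd.get_program().to_string_lossy().into_owned();
    let args = cmd
        .get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect();
    // `None` marks a variable that was explicitly removed.
    let envs = cmd
        .get_envs()
        .map(|(k, v)| {
            (
                k.to_string_lossy().into_owned(),
                v.map(|v| v.to_string_lossy().into_owned()),
            )
        })
        .collect();
    (program, args, envs)
}
```

Nothing in this tuple records whether `env_clear` was called or which stdio
handles were configured, which is why those settings cannot survive
`from_std_lossy`.
*/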
#[cfg(test)]
mod tests {
use super::*;
use crate::ExitStatus;
use std::collections::BTreeMap;
use std::fs;
use std::path::Path;
use std::str::from_utf8;
#[tokio::test]
async fn spawn() {
assert_eq!(
Command::new("true").spawn().unwrap().wait().await.unwrap(),
ExitStatus::Exited(0)
);
assert_eq!(
Command::new("false").spawn().unwrap().wait().await.unwrap(),
ExitStatus::Exited(1)
);
}
#[test]
fn wait_blocking() {
assert_eq!(
Command::new("true")
.spawn()
.unwrap()
.wait_blocking()
.unwrap(),
ExitStatus::Exited(0)
);
assert_eq!(
Command::new("false")
.spawn()
.unwrap()
.wait_blocking()
.unwrap(),
ExitStatus::Exited(1)
);
}
#[tokio::test]
async fn spawn_fail() {
assert_eq!(
Command::new("/iprobablydonotexist").spawn().unwrap_err(),
Error::new(Errno::ENOENT, Context::Exec)
);
}
#[tokio::test]
async fn double_wait() {
let mut child = Command::new("true").spawn().unwrap();
assert_eq!(child.wait().await.unwrap(), ExitStatus::Exited(0));
assert_eq!(child.wait().await.unwrap(), ExitStatus::Exited(0));
}
#[tokio::test]
async fn output() {
let output = Command::new("echo")
.arg("foo")
.arg("bar")
.output()
.await
.unwrap();
assert_eq!(output.stdout, b"foo bar\n");
assert_eq!(output.stderr, b"");
assert_eq!(output.status, ExitStatus::Exited(0));
}
fn parse_proc_status(stdout: &[u8]) -> BTreeMap<&str, &str> {
from_utf8(stdout)
.unwrap()
.trim_end()
.split('\n')
.map(|line| {
let mut items = line.splitn(2, ':');
let first = items.next().unwrap();
let second = items.next().unwrap();
(first, second.trim())
})
.collect()
}
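/*
For reference, `/proc/self/status` consists of `Key:<tab>value` lines, which
is what `parse_proc_status` relies on. Below is a standalone sketch of the
same parsing on a `&str`. Note one deliberate difference: this hypothetical
`parse_status` skips lines without a colon instead of panicking.

```rust
use std::collections::BTreeMap;

/// Splits `Key:\tvalue` lines into a map, trimming the value.
fn parse_status(text: &str) -> BTreeMap<&str, &str> {
    text.trim_end()
        .lines()
        .filter_map(|line| line.split_once(':'))
        .map(|(key, value)| (key, value.trim()))
        .collect()
}
```

Multi-column values such as `Uid` keep their inner tabs, so the tests below
compare against strings like `0\t0\t0\t0`.
*/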
#[tokio::test]
async fn uid_namespace() {
let output = Command::new("cat")
.arg("/proc/self/status")
.map_root()
.output()
.await
.unwrap();
assert_eq!(output.status, ExitStatus::Exited(0));
let proc_status = parse_proc_status(&output.stdout);
// We should be root user inside of the container.
assert_eq!(proc_status["Uid"], "0\t0\t0\t0");
}
#[tokio::test]
async fn pid_namespace() {
let output = Command::new("cat")
.arg("/proc/self/status")
.map_root()
.unshare(Namespace::PID)
.output()
.await
.unwrap();
assert_eq!(output.status, ExitStatus::Exited(0));
let proc_status = parse_proc_status(&output.stdout);
assert_eq!(proc_status["NSpid"].split('\t').nth(1), Some("1"));
// Note that, since we haven't mounted a fresh /proc into the container,
// the child still sees what the parent sees and so the PID will *not*
// be 1.
assert_ne!(proc_status["Pid"], "1");
}
#[tokio::test]
async fn mount_proc() {
let output = Command::new("cat")
.arg("/proc/self/status")
.map_root()
.unshare(Namespace::PID)
.mount(Mount::proc())
.output()
.await
.unwrap();
assert_eq!(output.status, ExitStatus::Exited(0));
let proc_status = parse_proc_status(&output.stdout);
// With /proc mounted, the child really believes it is the root process.
assert_eq!(proc_status["NSpid"], "1");
assert_eq!(proc_status["Pid"], "1");
}
#[tokio::test]
async fn hostname() {
let output = Command::new("cat")
.arg("/proc/sys/kernel/hostname")
.map_root()
.hostname("foobar.local")
.output()
.await
.unwrap();
assert_eq!(output.status, ExitStatus::Exited(0));
let hostname = from_utf8(&output.stdout).unwrap().trim();
assert_eq!(hostname, "foobar.local");
}
#[tokio::test]
async fn domainname() {
let output = Command::new("cat")
.arg("/proc/sys/kernel/domainname")
.map_root()
.domainname("foobar")
.output()
.await
.unwrap();
assert_eq!(output.status, ExitStatus::Exited(0));
let domainname = from_utf8(&output.stdout).unwrap().trim();
assert_eq!(domainname, "foobar");
}
#[tokio::test]
async fn pty() {
use tokio::io::AsyncReadExt;
let mut pty = Pty::open().unwrap();
let pty_child = pty.child().unwrap();
let mut tty = pty_child.terminal_params().unwrap();
// Prevent post-processing of output so `\n` isn't translated to `\r\n`.
tty.c_oflag &= !libc::OPOST;
pty_child.set_terminal_params(&tty).unwrap();
pty_child.set_window_size(40, 80).unwrap();
// stty is in coreutils and should be available on most systems.
let mut child = Command::new("stty")
.arg("size")
.pty(pty_child)
.spawn()
.unwrap();
// NOTE: read_to_end returns an EIO error once the child has exited.
let mut buf = Vec::new();
assert!(pty.read_to_end(&mut buf).await.is_err());
assert_eq!(from_utf8(&buf).unwrap(), "40 80\n");
assert_eq!(child.wait().await.unwrap(), ExitStatus::SUCCESS);
}
#[tokio::test]
async fn mount_devpts_basic() {
let output = Command::new("ls")
.arg("/dev/pts")
.map_root()
.mount(Mount::devpts("/dev/pts"))
.output()
.await
.unwrap();
assert_eq!(output.status, ExitStatus::Exited(0));
// Should be totally empty except for `/dev/pts/ptmx` since we mounted a
// new devpts.
assert_eq!(output.stderr, b"");
assert_eq!(output.stdout, b"ptmx\n");
}
#[tokio::test]
async fn mount_devpts_isolated() {
let output = Command::new("ls")
.arg("/dev/pts")
.map_root()
.mount(Mount::devpts("/dev/pts").data("newinstance,ptmxmode=0666"))
.mount(Mount::bind("/dev/pts/ptmx", "/dev/ptmx"))
.output()
.await
.unwrap();
assert_eq!(output.status, ExitStatus::Exited(0));
// Should be totally empty except for `/dev/pts/ptmx` since we mounted a
// new devpts.
assert_eq!(output.stderr, b"");
assert_eq!(output.stdout, b"ptmx\n");
}
#[tokio::test]
async fn mount_tmpfs() {
let output = Command::new("ls")
.arg("/tmp")
.map_root()
.mount(Mount::tmpfs("/tmp"))
.output()
.await
.unwrap();
assert_eq!(output.status, ExitStatus::Exited(0));
// Should be totally empty since we mounted a new tmpfs.
assert_eq!(output.stderr, b"");
assert_eq!(output.stdout, b"");
}
#[tokio::test]
async fn mount_and_move_tmpfs() {
let tmpfs = tempfile::tempdir().unwrap();
// Create a temporary directory that will be the only thing to remain in
// the `/tmp` mount.
let persistent = tempfile::tempdir().unwrap();
fs::write(persistent.path().join("foobar"), b"").unwrap();
let output = Command::new("ls")
.arg("/tmp")
.map_root()
.mount(Mount::tmpfs(tmpfs.path()))
// Bind-mount a directory from our upper /tmp to our new /tmp.
.mount(Mount::bind(persistent.path(), &tmpfs.path().join("my-dir")).touch_target())
// Move our newly-created tmpfs to hide the upper /tmp folder.
.mount(Mount::rename(tmpfs.path(), Path::new("/tmp")))
.output()
.await
.unwrap();
assert_eq!(output.status, ExitStatus::Exited(0));
// The only thing there should be our bind-mounted directory.
assert_eq!(output.stderr, b"");
assert_eq!(output.stdout, b"my-dir\n");
}
#[tokio::test]
async fn mount_bind() {
let temp = tempfile::tempdir().unwrap();
let a = temp.path().join("a");
let b = temp.path().join("b");
fs::create_dir(&a).unwrap();
fs::create_dir(&b).unwrap();
fs::write(a.join("foobar"), "im a test").unwrap();
let output = Command::new("ls")
.arg(&b)
.map_root()
.mount(Mount::bind(&a, &b))
.output()
.await
.unwrap();
assert_eq!(output.status, ExitStatus::Exited(0));
assert_eq!(output.stdout, b"foobar\n");
assert_eq!(output.stderr, b"");
}
#[tokio::test]
async fn local_networking_ping() {
let output = Command::new("ping")
.arg("-c1")
.arg("::1")
.map_root()
.local_networking_only()
.output()
.await
.unwrap();
assert_eq!(output.status, ExitStatus::Exited(0), "{:?}", output);
}
#[tokio::test]
async fn local_networking_loopback_flags() {
let output = Command::new("cat")
.arg("/sys/class/net/lo/flags")
.map_root()
.local_networking_only()
.output()
.await
.unwrap();
assert_eq!(output.status, ExitStatus::Exited(0), "{:?}", output);
assert_eq!(output.stdout, b"0x9\n", "{:?}", output);
}
/// Show that processes in two separate network namespaces can bind to the
/// same port.
#[tokio::test]
async fn port_isolation() {
use std::thread::sleep;
use std::time::Duration;
let mut command = Command::new("nc");
command
.arg("-l")
.arg("127.0.0.1")
// Can bind to a low port without real root inside the namespace.
.arg("80")
.stdin(Stdio::null())
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.map_root()
.local_networking_only();
let server1 = match command.spawn() {
// If netcat is not installed just exit successfully.
Err(error) if error.errno() == Errno::ENOENT => return,
other => other,
}
.unwrap();
let server2 = command.spawn().unwrap();
// Give them both time to start up.
sleep(Duration::from_millis(100));
// Signal them to shut down. Otherwise, they will wait forever for a
// connection that will never come.
server1.signal(Signal::SIGINT).unwrap();
server2.signal(Signal::SIGINT).unwrap();
let output1 = server1.wait_with_output().await.unwrap();
let output2 = server2.wait_with_output().await.unwrap();
// Without network isolation, one of the servers would exit with an
// "Address already in use" (exit status 2) error.
assert_eq!(
output1.status,
ExitStatus::Signaled(Signal::SIGINT, false),
"{:?}",
output1
);
assert_eq!(
output2.status,
ExitStatus::Signaled(Signal::SIGINT, false),
"{:?}",
output2
);
}
/// Make sure we can call `.local_networking_only` more than once.
#[tokio::test]
async fn local_networking_there_can_be_only_one() {
let output = Command::new("true")
.map_root()
.local_networking_only()
// If calling this twice mounted /sys twice, then we'd get a "Device
// or resource busy" error.
.local_networking_only()
.output()
.await
.unwrap();
assert_eq!(output.status, ExitStatus::Exited(0), "{:?}", output);
assert_eq!(output.stdout, b"", "{:?}", output);
assert_eq!(output.stderr, b"", "{:?}", output);
}
#[test]
fn from_std_lossy() {
let mut stdcmd = std::process::Command::new("echo");
stdcmd.args(["arg1", "arg2"]);
stdcmd.current_dir("/foo/bar");
stdcmd.env_clear();
stdcmd.env("FOO", "1");
stdcmd.env("BAR", "2");
let cmd = Command::from_std_lossy(&stdcmd);
assert_eq!(cmd.get_program(), "echo");
assert_eq!(cmd.get_arg0(), "echo");
assert_eq!(cmd.get_args().collect::<Vec<_>>(), ["arg1", "arg2"]);
let envs = cmd
.get_envs()
.filter_map(|(k, v)| Some((k.to_str()?, v.and_then(|v| v.to_str()))))
.collect::<Vec<_>>();
assert_eq!(envs, [("BAR", Some("2")), ("FOO", Some("1"))]);
}
#[test]
fn into_std_lossy() {
let mut cmd = Command::new("env");
cmd.args(["-0"]);
cmd.current_dir("/foo/bar");
cmd.env_clear();
cmd.env("FOO", "1");
cmd.env("BAR", "2");
let stdcmd = cmd.into_std_lossy();
assert_eq!(stdcmd.get_program(), "env");
assert_eq!(stdcmd.get_args().collect::<Vec<_>>(), ["-0"]);
let envs = stdcmd
.get_envs()
.filter_map(|(k, v)| Some((k.to_str()?, v.and_then(|v| v.to_str()))))
.collect::<Vec<_>>();
assert_eq!(envs, [("BAR", Some("2")), ("FOO", Some("1"))]);
}
}


@ -0,0 +1,556 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use super::fd::{create_dir_all, touch_path, FileType};
use super::util;
use core::convert::Infallible;
use core::fmt;
use core::ptr;
use core::str::FromStr;
use std::collections::HashMap;
use std::ffi::{CString, OsStr};
use std::os::unix::ffi::OsStrExt;
use std::path::Path;
use syscalls::Errno;
pub use nix::mount::MsFlags as MountFlags;
/// A mount.
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct Mount {
source: Option<CString>,
target: CString,
fstype: Option<CString>,
flags: MountFlags,
data: Option<CString>,
touch_target: bool,
}
/// Represents a bind mount. Can be converted into a [`Mount`].
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct Bind {
/// The source path of the bind mount. This path must exist. It can be either
/// a file or directory.
source: CString,
/// The target of the bind mount. This does not need to exist and can be
/// created when performing the bind mount.
target: CString,
}
impl Mount {
/// Creates a new mount at the path `target`.
pub fn new<S: AsRef<OsStr>>(target: S) -> Self {
Self {
source: None,
target: util::to_cstring(target),
fstype: None,
flags: MountFlags::empty(),
data: None,
touch_target: false,
}
}
/// Creates a bind mount. This effectively creates a hard link of a directory,
/// making the contents accessible in both places.
///
/// By default, none of the mounts in the `source` directory are visible in
/// `destination`. To make all mounts recursively visible, combine this with
/// [`Self::recursive`]. Can also be used with [`Self::readonly`] to make the
/// contents of `destination` read-only.
pub fn bind<S: AsRef<OsStr>, D: AsRef<OsStr>>(source: S, destination: D) -> Self {
Self::new(destination)
.source(source)
.flags(MountFlags::MS_BIND)
}
/// Move/rename a mount.
pub fn rename<S: AsRef<OsStr>, D: AsRef<OsStr>>(source: S, destination: D) -> Self {
Self::new(destination)
.source(source)
.flags(MountFlags::MS_MOVE)
}
/// Mount a fresh devpts file system. The target is usually `/dev/pts`.
///
/// In order for this devpts to be private and independent of other devpts
/// (i.e., for containers), use:
/// ```ignore
/// Mount::devpts("/dev/pts").data("newinstance,ptmxmode=0666")
/// ```
/// And either make `/dev/ptmx` a symlink pointing to `/dev/pts/ptmx` or
/// bind-mount it.
///
/// See also: <https://www.kernel.org/doc/Documentation/filesystems/devpts.txt>
pub fn devpts<S: AsRef<OsStr>>(target: S) -> Self {
Self::new(target).fstype("devpts")
}
/// Mount a fresh proc file system at `/proc`.
pub fn proc() -> Self {
Self::new("/proc").fstype("proc")
}
/// Mount an overlay file system.
///
/// NOTE: This only works in Linux 5.11 or newer when mounted from a user
/// namespace. Otherwise, you need real root privileges to mount an
/// overlayfs.
///
/// An overlay filesystem combines two filesystems - an upper filesystem and
/// a lower filesystem. When a name exists in both filesystems, the object
/// in the upper filesystem is visible while the object in the lower
/// filesystem is either hidden or, in the case of directories, merged with
/// the upper object.
///
/// In other words, the `lowerdir` and `upperdir` are combined into a
/// directory `merged` using `workdir` as a temporary work area.
///
/// The lower filesystem can be any filesystem supported by Linux and does
/// not need to be writable. The lower filesystem can even be another
/// overlayfs. The upper filesystem should be writable.
///
/// See <https://www.kernel.org/doc/html/latest/filesystems/overlayfs.html> for
/// more information.
///
/// # Arguments
///
/// * `lowerdir` - The lower directory of the overlay. Can be any filesystem
/// and does not need to be writable. This directory is never
/// modified by writes to `merged`.
/// * `upperdir` - The upper directory of the overlay. This is where all
/// changes to `merged` are collected. Does not need to be
/// empty, but should be when starting a new overlay from
/// scratch.
/// * `workdir` - The work directory. This should always be empty.
/// * `merged` - The combination of `lowerdir` and `upperdir`.
pub fn overlay(lowerdir: &Path, upperdir: &Path, workdir: &Path, merged: &Path) -> Self {
// TODO: Since there can actually be multiple lowerdirs, it might be
// more ergonomic to return an `OverlayBuilder` instead.
let options = format!(
"lowerdir={},upperdir={},workdir={}",
lowerdir.display(),
upperdir.display(),
workdir.display()
);
Self::new(merged)
.fstype("overlay")
.source("overlay")
.data(options)
}
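/*
The `data` string that `overlay` hands to the kernel can be sketched on its
own. This is a minimal illustration with a hypothetical function name; it
performs no escaping, so a path containing `,` would split into separate
mount options.

```rust
use std::path::Path;

/// Builds the `data` string for an overlay mount with a single lowerdir.
fn overlay_options(lowerdir: &Path, upperdir: &Path, workdir: &Path) -> String {
    format!(
        "lowerdir={},upperdir={},workdir={}",
        lowerdir.display(),
        upperdir.display(),
        workdir.display()
    )
}
```

The merged directory is not part of this string; it is the mount target.
*/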
/// Creates a temporary file system at the location specified.
pub fn tmpfs<S: AsRef<OsStr>>(target: S) -> Self {
Self::new(target).fstype("tmpfs")
}
/// Creates a sys file system at the location specified. The target directory
/// is usually `/sys`. This is useful when creating a network namespace.
pub fn sysfs<S: AsRef<OsStr>>(target: S) -> Self {
Self::new(target).fstype("sysfs")
}
/// Sets the mount point target.
pub fn target<S: AsRef<OsStr>>(mut self, target: S) -> Self {
self.target = util::to_cstring(target);
self
}
/// Returns the mount point target path.
pub fn get_target(&self) -> &Path {
Path::new(OsStr::from_bytes(self.target.to_bytes()))
}
/// Sets the source of the mount.
pub fn source<S: AsRef<OsStr>>(mut self, path: S) -> Self {
self.source = Some(util::to_cstring(path));
self
}
/// Returns the mount point source path (if any).
pub fn get_source(&self) -> Option<&Path> {
self.source
.as_ref()
.map(|s| Path::new(OsStr::from_bytes(s.to_bytes())))
}
/// Indicates that the target of a bind mount should be created
/// automatically.
pub fn touch_target(mut self) -> Self {
self.touch_target = true;
self
}
/// Adds mount flags.
pub fn flags(mut self, flags: MountFlags) -> Self {
self.flags |= flags;
self
}
/// Make the file system read-only.
pub fn readonly(mut self) -> Self {
self.flags |= MountFlags::MS_RDONLY;
self
}
/// Makes a bind mount recursive.
pub fn recursive(mut self) -> Self {
self.flags |= MountFlags::MS_REC;
self
}
/// Makes this mount point private. Mount and unmount events do not propagate
/// into or out of this mount point.
pub fn private(mut self) -> Self {
self.flags |= MountFlags::MS_PRIVATE;
self
}
/// Make this mount point shared. Mount and unmount events immediately under
/// this mount point will propagate to the other mount points that are
/// members of this mount's peer group. Propagation here means that the same
/// mount or unmount will automatically occur under all of the other mount
/// points in the peer group. Conversely, mount and unmount events that take
/// place under peer mount points will propagate to this mount point.
pub fn shared(mut self) -> Self {
self.flags |= MountFlags::MS_SHARED;
self
}
/// Same as specifying both [`Self::recursive`] and [`Self::private`].
pub fn rprivate(mut self) -> Self {
self.flags |= MountFlags::MS_REC | MountFlags::MS_PRIVATE;
self
}
/// Same as specifying both [`Self::recursive`] and [`Self::shared`].
pub fn rshared(mut self) -> Self {
self.flags |= MountFlags::MS_REC | MountFlags::MS_SHARED;
self
}
/// Sets the filesystem type.
pub fn fstype<S: AsRef<OsStr>>(mut self, fstype: S) -> Self {
self.fstype = Some(util::to_cstring(fstype));
self
}
/// Sets any additional data required by the mount.
pub fn data<S: AsRef<OsStr>>(mut self, data: S) -> Self {
self.data = Some(util::to_cstring(data));
self
}
fn source_ptr(&self) -> *const libc::c_char {
self.source.as_ref().map_or(ptr::null(), |s| s.as_ptr())
}
fn target_ptr(&self) -> *const libc::c_char {
self.target.as_ptr()
}
fn fstype_ptr(&self) -> *const libc::c_char {
self.fstype.as_ref().map_or(ptr::null(), |s| s.as_ptr())
}
fn data_ptr(&self) -> *const libc::c_void {
self.data
.as_ref()
.map_or(ptr::null(), |s| s.as_ptr() as *const libc::c_void)
}
/// Performs the mount. For bind-mount operations, the target directory or
/// file is created if [`touch_target`] was used.
///
/// NOTE: This function *must* not allocate since it is called after `fork`
/// (or `clone`) and before `execve`. Any allocations could cause deadlocks
/// (which are hard to track down).
pub(super) fn mount(&mut self) -> Result<(), Errno> {
// NOTE: Although we can't allocate here, we can safely *modify* `self`.
// When this function is called, we have forked virtual memory and any
// modifications we make are copy-on-write and lost when `execve` is
// called. Thus, this function takes `self` by mutable reference.
if self.flags.contains(MountFlags::MS_BIND) && self.touch_target {
// Bind mounts will fail unless the destination path exists, so it
// is convenient to create it automatically.
//
// One reason for doing this here instead of the parent process is
// because the target may not yet exist until we mount it. For
// example, if we want to create a `/tmp` (tmpfs) folder and then
// bind-mount some files or directories into it, pre-creating the
// destination directories won't work because they'll get created in
// a different tmpfs.
if let Some(src) = &self.source {
if FileType::new(src.as_ptr())?.is_dir() {
create_dir_all(&mut self.target, 0o777)?;
} else {
touch_path(&mut self.target, 0o666, 0o777)?;
}
}
}
Errno::result(unsafe {
libc::mount(
self.source_ptr(),
self.target_ptr(),
self.fstype_ptr(),
self.flags.bits(),
self.data_ptr(),
)
})?;
Ok(())
}
}
impl Bind {
/// Creates a new bind mount from `source` to `target`. Both paths are
/// required. To bind a path onto itself, pass the same path for both
/// arguments, or parse a single path with `Bind::from`, which uses the
/// source as the target when no `:` separator is present.
pub fn new<S, T>(source: S, target: T) -> Self
where
S: AsRef<OsStr>,
T: AsRef<OsStr>,
{
Self {
source: util::to_cstring(source),
target: util::to_cstring(target),
}
}
}
impl From<Bind> for Mount {
fn from(b: Bind) -> Self {
Self {
source: Some(b.source),
target: b.target,
fstype: None,
flags: MountFlags::MS_BIND,
data: None,
touch_target: false,
}
}
}
impl From<&str> for Bind {
fn from(s: &str) -> Self {
if let Some((source, target)) = s.split_once(':') {
Self {
source: util::to_cstring(source),
target: util::to_cstring(target),
}
} else {
let source = util::to_cstring(s);
let target = source.clone();
Self { source, target }
}
}
}
impl FromStr for Bind {
type Err = Infallible;
/// Parses bind mounts of the following forms:
/// 1. "path/to/source"
/// 2. "path/to/source:path/to/dest"
fn from_str(s: &str) -> Result<Self, Self::Err> {
Ok(Self::from(s))
}
}
/// An error from parsing a mount.
#[derive(thiserror::Error, Debug, Eq, PartialEq)]
pub enum MountParseError {
/// The `target` key is missing. This is always required.
MissingTarget,
/// An invalid mount option was specified.
Invalid(String, Option<String>),
}
impl fmt::Display for MountParseError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self {
Self::MissingTarget => write!(f, "missing mount target"),
Self::Invalid(k, v) => match v {
Some(v) => write!(f, "invalid mount option '{}={}'", k, v),
None => write!(f, "invalid mount option '{}'", k),
},
}
}
}
impl FromStr for Mount {
type Err = MountParseError;
/// Parses a [`Mount`]. This accepts the same syntax as Docker mounts where
/// each mount consists of a comma-separated key-value list.
///
/// See https://docs.docker.com/storage/bind-mounts/ for more information.
fn from_str(s: &str) -> Result<Self, Self::Err> {
let mut map: HashMap<&str, Option<&str>> = HashMap::new();
for item in s.split(',') {
let item = item.trim();
if item.is_empty() {
continue;
}
let (key, value) = match item.split_once('=') {
Some((key, value)) => (key, Some(value)),
None => (item, None),
};
map.insert(key, value);
}
// The mount target is always required.
let mut mount = match map
.remove("target")
.or_else(|| map.remove("destination"))
.or_else(|| map.remove("dest"))
.or_else(|| map.remove("dst"))
.flatten()
{
Some(target) => Mount::new(target),
None => {
return Err(MountParseError::MissingTarget);
}
};
if let Some(source) = map.remove("source").or_else(|| map.remove("src")).flatten() {
mount = mount.source(source);
}
let is_bind_mount = if let Some(fstype) = map.remove("type").flatten() {
if fstype == "bind" {
true
} else {
mount = mount.fstype(fstype);
false
}
} else {
true
};
if is_bind_mount {
mount = mount.flags(MountFlags::MS_BIND);
}
if let Some((key, value)) = map.remove_entry("readonly") {
if let Some(value) = value {
// No value should have been specified.
return Err(MountParseError::Invalid(key.into(), Some(value.to_owned())));
}
mount = mount.readonly();
}
if let Some(propagation) = map.remove("bind-propagation").flatten() {
let flags = match propagation {
"shared" => MountFlags::MS_SHARED,
"slave" => MountFlags::MS_SLAVE,
"private" => MountFlags::MS_PRIVATE,
"rshared" => MountFlags::MS_REC | MountFlags::MS_SHARED,
"rslave" => MountFlags::MS_REC | MountFlags::MS_SLAVE,
"rprivate" => MountFlags::MS_REC | MountFlags::MS_PRIVATE,
_ => {
return Err(MountParseError::Invalid(
"bind-propagation".into(),
Some(propagation.into()),
));
}
};
mount = mount.flags(flags);
} else {
// All mounts get these flags by default.
mount = mount.flags(MountFlags::MS_REC | MountFlags::MS_PRIVATE);
}
// Any left over keys are invalid.
if let Some((k, v)) = map.into_iter().next() {
return Err(MountParseError::Invalid(k.into(), v.map(ToOwned::to_owned)));
}
Ok(mount)
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn getters_and_setters() {
let m = Mount::bind("/foo", "/bar");
assert_eq!(m.get_target(), Path::new("/bar"));
assert_eq!(m.get_source(), Some(Path::new("/foo")));
let m = m.target("/baz");
assert_eq!(m.get_target(), Path::new("/baz"));
}
#[test]
fn parse_mount() {
assert_eq!(
Mount::from_str("type=bind,source=/foo,target=/bar,readonly"),
Ok(Mount::bind("/foo", "/bar").readonly().rprivate())
);
assert_eq!(
Mount::from_str("src=/foo,target=/bar,readonly"),
Ok(Mount::bind("/foo", "/bar").readonly().rprivate())
);
assert_eq!(
Mount::from_str("src=/foo,target=/bar,bind-propagation=rshared"),
Ok(Mount::bind("/foo", "/bar").rshared())
);
assert_eq!(
Mount::from_str("type=tmpfs,target=/tmp"),
Ok(Mount::tmpfs("/tmp").rprivate())
);
assert_eq!(
Mount::from_str("target=foo, ,,,"),
Ok(Mount::new("foo").flags(MountFlags::MS_BIND).rprivate())
);
assert_eq!(Mount::from_str(""), Err(MountParseError::MissingTarget));
assert_eq!(
Mount::from_str("type=bind,source=/foo,readonly"),
Err(MountParseError::MissingTarget)
);
assert_eq!(
Mount::from_str("type=tmpfs,target=/foo,wat"),
Err(MountParseError::Invalid("wat".into(), None))
);
assert_eq!(
Mount::from_str("type=tmpfs,target=/foo,readonly=wat"),
Err(MountParseError::Invalid(
"readonly".into(),
Some("wat".into())
))
);
}
#[test]
fn parse_bind() {
assert_eq!(Bind::from("source:target"), Bind::new("source", "target"));
assert_eq!(Bind::from("source"), Bind::new("source", "source"));
assert_eq!(
Mount::from(Bind::from("source:target")),
Mount::bind("source", "target")
);
}
}
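
The first two steps of `Mount::from_str` above (splitting the comma-separated spec into keys with optional values, then resolving the target under any of its aliases) can be sketched with the standard library alone. This is an illustrative sketch rather than the crate's API: `split_mount_spec` and `find_target` are hypothetical helper names that do not appear in the source.

```rust
use std::collections::HashMap;

/// Splits a Docker-style mount spec into a key -> optional-value map.
/// Empty items (for example from trailing commas) are skipped, and a key
/// without `=` maps to `None`. (Hypothetical helper for illustration.)
fn split_mount_spec(spec: &str) -> HashMap<&str, Option<&str>> {
    let mut map = HashMap::new();
    for item in spec.split(',') {
        let item = item.trim();
        if item.is_empty() {
            continue;
        }
        let (key, value) = match item.split_once('=') {
            Some((key, value)) => (key, Some(value)),
            None => (item, None),
        };
        // A repeated key overwrites the earlier value.
        map.insert(key, value);
    }
    map
}

/// Removes and returns the mount target, trying each accepted alias in
/// order. Returns `None` if no alias is present or if the first alias found
/// has no value. (Hypothetical helper for illustration.)
fn find_target<'a>(map: &mut HashMap<&'a str, Option<&'a str>>) -> Option<&'a str> {
    ["target", "destination", "dest", "dst"]
        .iter()
        .find_map(|key| map.remove(*key))
        .flatten()
}
```

For the spec `type=bind,src=/foo,dst=/bar,readonly`, `find_target` would return `/bar` and leave `type`, `src`, and `readonly` in the map for the later steps. A bare `target` key with no value shadows the other aliases, which matches the `or_else` chain in the original.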


@ -0,0 +1,74 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use serde::{Deserialize, Serialize};
use std::str::FromStr;
bitflags::bitflags! {
/// A namespace that may be unshared with [`Command::unshare`].
///
/// [`Command::unshare`]: super::Command::unshare
#[derive(Deserialize, Serialize)]
pub struct Namespace: i32 {
/// Cgroup namespace.
const CGROUP = libc::CLONE_NEWCGROUP;
/// IPC namespace.
const IPC = libc::CLONE_NEWIPC;
/// Network namespace.
const NETWORK = libc::CLONE_NEWNET;
/// Mount namespace.
const MOUNT = libc::CLONE_NEWNS;
/// PID namespace.
const PID = libc::CLONE_NEWPID;
/// User and group namespace.
const USER = libc::CLONE_NEWUSER;
/// UTS namespace.
const UTS = libc::CLONE_NEWUTS;
}
}
impl Default for Namespace {
fn default() -> Self {
Self::empty()
}
}
#[derive(Debug, Clone)]
pub enum ParseNamespaceError {
InvalidNamespace(String),
}
impl std::error::Error for ParseNamespaceError {}
impl core::fmt::Display for ParseNamespaceError {
fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
match self {
ParseNamespaceError::InvalidNamespace(ns) => {
write!(f, "Invalid namespace: {}", ns)
}
}
}
}
impl FromStr for Namespace {
type Err = ParseNamespaceError;
fn from_str(s: &str) -> Result<Self, ParseNamespaceError> {
s.split(',').try_fold(Namespace::empty(), |ns, s| match s {
"cgroup" => Ok(ns | Namespace::CGROUP),
"ipc" => Ok(ns | Namespace::IPC),
"network" => Ok(ns | Namespace::NETWORK),
"pid" => Ok(ns | Namespace::PID),
"mount" => Ok(ns | Namespace::MOUNT),
"user" => Ok(ns | Namespace::USER),
"uts" => Ok(ns | Namespace::UTS),
"" | "none" => Ok(ns),
invalid_ns => Err(ParseNamespaceError::InvalidNamespace(invalid_ns.to_owned())),
})
}
}
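
The `try_fold` in `Namespace::from_str` above accumulates flags one name at a time and stops at the first unrecognized name. A minimal standard-library sketch of the same pattern follows, assuming plain `u32` flags in place of the `bitflags` type and a `String` error in place of `ParseNamespaceError`. `parse_namespaces` is a hypothetical name, and only three of the seven namespaces are shown.

```rust
// Flag values as defined by Linux in <linux/sched.h>.
const CLONE_NEWNS: u32 = 0x0002_0000;
const CLONE_NEWPID: u32 = 0x2000_0000;
const CLONE_NEWNET: u32 = 0x4000_0000;

/// Parses a comma-separated namespace list into a flag set.
/// (Hypothetical helper for illustration.)
fn parse_namespaces(s: &str) -> Result<u32, String> {
    // `try_fold` threads the accumulated flags through each item and
    // short-circuits on the first `Err`.
    s.split(',').try_fold(0u32, |flags, name| match name {
        "mount" => Ok(flags | CLONE_NEWNS),
        "pid" => Ok(flags | CLONE_NEWPID),
        "network" => Ok(flags | CLONE_NEWNET),
        // Empty items and "none" contribute no flags.
        "" | "none" => Ok(flags),
        invalid => Err(format!("Invalid namespace: {}", invalid)),
    })
}
```

As in the original, items are not trimmed, so `"pid, mount"` is rejected because of the leading space in `" mount"`.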

220
reverie-process/src/net.rs Normal file

@ -0,0 +1,220 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use super::fd::Fd;
use std::ffi::CStr;
use std::ffi::OsStr;
use std::mem::MaybeUninit;
use std::os::unix::io::AsRawFd;
use syscalls::Errno;
/// Interface name.
#[derive(Default, Debug, Copy, Clone, Eq, PartialEq)]
#[repr(C)]
pub struct IfName([u8; libc::IFNAMSIZ]);
/// A network interface request.
#[derive(Debug, Copy, Clone)]
#[repr(C)]
struct IfReq<T> {
/// The interface name.
name: IfName,
/// The request type.
///
/// NOTE: The kernel's `ifreq` struct is made up of a `union` of all
/// possible request types. Thus, the size of `ifreq` is not necessarily the
/// same as the size of `IfReq<T>`. However, there is no danger of a buffer
/// overrun, since the kernel does not write to the unused parts of the union
/// when handling the associated ioctls.
req: T,
}
#[derive(Debug, Copy, Clone)]
#[repr(C)]
pub struct ifmap {
pub mem_start: usize,
pub mem_end: usize,
pub base_addr: u16,
pub irq: u8,
pub dma: u8,
pub port: u8,
/* 3 bytes spare */
}
impl IfName {
// Many unused functions here. The full set of ioctls is implemented for
// `ifreq`, but we don't need them yet.
#![allow(unused)]
/// The name of the loopback interface.
pub const LOOPBACK: Self = Self(*b"lo\0\0\0\0\0\0\0\0\0\0\0\0\0\0");
pub fn new<S: AsRef<OsStr>>(name: S) -> Result<Self, InterfaceNameTooLong> {
use std::os::unix::ffi::OsStrExt;
let name = name.as_ref().as_bytes();
if name.len() + 1 > libc::IFNAMSIZ {
Err(InterfaceNameTooLong)
} else {
let mut arr = [0u8; libc::IFNAMSIZ];
arr[..name.len()].copy_from_slice(name);
arr[name.len()] = 0;
Ok(Self(arr))
}
}
fn ioctl_get<T>(self, ioctl: libc::c_ulong, socket: &Fd) -> Result<T, Errno> {
let mut req = IfReq::new(self, MaybeUninit::uninit());
Errno::result(unsafe { libc::ioctl(socket.as_raw_fd(), ioctl, &mut req as *mut _) })?;
Ok(unsafe { req.into_req().assume_init() })
}
fn ioctl_set<T>(self, ioctl: libc::c_ulong, socket: &Fd, value: T) -> Result<(), Errno> {
let req = IfReq::new(self, value);
Errno::result(unsafe { libc::ioctl(socket.as_raw_fd(), ioctl, &req as *const _) })?;
Ok(())
}
pub fn get_addr(&self, socket: &Fd) -> Result<libc::sockaddr, Errno> {
self.ioctl_get(libc::SIOCGIFADDR, socket)
}
pub fn set_addr(&self, socket: &Fd, addr: libc::sockaddr) -> Result<(), Errno> {
self.ioctl_set(libc::SIOCSIFADDR, socket, addr)
}
pub fn get_dest_addr(&self, socket: &Fd) -> Result<libc::sockaddr, Errno> {
self.ioctl_get(libc::SIOCGIFDSTADDR, socket)
}
pub fn set_dest_addr(&self, socket: &Fd, addr: libc::sockaddr) -> Result<(), Errno> {
self.ioctl_set(libc::SIOCSIFDSTADDR, socket, addr)
}
pub fn get_broadcast_addr(&self, socket: &Fd) -> Result<libc::sockaddr, Errno> {
self.ioctl_get(libc::SIOCGIFBRDADDR, socket)
}
pub fn set_broadcast_addr(&self, socket: &Fd, addr: libc::sockaddr) -> Result<(), Errno> {
self.ioctl_set(libc::SIOCSIFBRDADDR, socket, addr)
}
pub fn get_netmask(&self, socket: &Fd) -> Result<libc::sockaddr, Errno> {
self.ioctl_get(libc::SIOCGIFNETMASK, socket)
}
pub fn set_netmask(&self, socket: &Fd, addr: libc::sockaddr) -> Result<(), Errno> {
self.ioctl_set(libc::SIOCSIFNETMASK, socket, addr)
}
pub fn get_hw_addr(&self, socket: &Fd) -> Result<libc::sockaddr, Errno> {
self.ioctl_get(libc::SIOCGIFHWADDR, socket)
}
pub fn set_hw_addr(&self, socket: &Fd, addr: libc::sockaddr) -> Result<(), Errno> {
self.ioctl_set(libc::SIOCSIFHWADDR, socket, addr)
}
pub fn get_flags(&self, socket: &Fd) -> Result<i16, Errno> {
self.ioctl_get(libc::SIOCGIFFLAGS, socket)
}
pub fn set_flags(&self, socket: &Fd, flags: i16) -> Result<(), Errno> {
self.ioctl_set(libc::SIOCSIFFLAGS, socket, flags)
}
pub fn get_metric(&self, socket: &Fd) -> Result<i32, Errno> {
self.ioctl_get(libc::SIOCGIFMETRIC, socket)
}
pub fn set_metric(&self, socket: &Fd, value: i32) -> Result<(), Errno> {
self.ioctl_set(libc::SIOCSIFMETRIC, socket, value)
}
pub fn get_mtu(&self, socket: &Fd) -> Result<i32, Errno> {
self.ioctl_get(libc::SIOCGIFMTU, socket)
}
pub fn set_mtu(&self, socket: &Fd, value: i32) -> Result<(), Errno> {
self.ioctl_set(libc::SIOCSIFMTU, socket, value)
}
/// Gets the device map.
pub fn get_map(&self, socket: &Fd) -> Result<ifmap, Errno> {
self.ioctl_get(libc::SIOCGIFMAP, socket)
}
/// Sets the device map.
pub fn set_map(&self, socket: &Fd, map: ifmap) -> Result<(), Errno> {
self.ioctl_set(libc::SIOCSIFMAP, socket, map)
}
/// Gets the slave device.
pub fn get_slave(&self, socket: &Fd) -> Result<Self, Errno> {
self.ioctl_get(libc::SIOCGIFSLAVE, socket)
}
/// Sets the slave device.
pub fn set_slave(&self, socket: &Fd, name: Self) -> Result<(), Errno> {
self.ioctl_set(libc::SIOCSIFSLAVE, socket, name)
}
}
impl AsRef<CStr> for IfName {
fn as_ref(&self) -> &CStr {
unsafe { CStr::from_ptr(self.0.as_ptr() as *const _) }
}
}
/// An error indicating that the interface name is too long.
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub struct InterfaceNameTooLong;
impl<T> IfReq<T> {
/// Creates a new interface request.
pub fn new(name: IfName, req: T) -> Self {
Self { name, req }
}
pub fn into_req(self) -> T {
self.req
}
}
#[cfg(test)]
mod tests {
use super::*;
use nix::net::if_::InterfaceFlags;
#[test]
fn ifname() {
assert_eq!(IfName::new("lo"), Ok(IfName::LOOPBACK));
assert_eq!(
IfName::new("too loooooooooooooooong"),
Err(InterfaceNameTooLong)
);
}
#[test]
fn smoke_tests() {
let sock = Fd::socket(libc::AF_INET, libc::SOCK_DGRAM, libc::IPPROTO_IP).unwrap();
let lo = IfName::LOOPBACK;
let addr = lo.get_addr(&sock).unwrap();
assert_eq!(addr.sa_family as i32, libc::AF_INET);
let flags = InterfaceFlags::from_bits_truncate(lo.get_flags(&sock).unwrap() as i32);
assert!(flags.contains(InterfaceFlags::IFF_LOOPBACK));
}
}
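
The length check in `IfName::new` above reserves one byte of the fixed-size buffer for the trailing NUL, so the longest accepted name is `IFNAMSIZ - 1` bytes. A minimal sketch of that packing step, assuming `IFNAMSIZ` is 16 (its value on Linux) and using a hypothetical `pack_ifname` function that returns `Option` instead of the crate's `InterfaceNameTooLong` error:

```rust
/// Size of an interface name buffer, including the trailing NUL.
/// Assumed to be 16, as on Linux.
const IFNAMSIZ: usize = 16;

/// Copies `name` into a zero-filled, NUL-terminated buffer.
/// Returns `None` if the name plus its terminator does not fit.
/// (Hypothetical helper for illustration.)
fn pack_ifname(name: &[u8]) -> Option<[u8; IFNAMSIZ]> {
    if name.len() + 1 > IFNAMSIZ {
        return None;
    }
    // The buffer starts zeroed, so every byte after the name is already NUL.
    let mut buf = [0u8; IFNAMSIZ];
    buf[..name.len()].copy_from_slice(name);
    Some(buf)
}
```

Because the buffer is zero-initialized, the explicit `arr[name.len()] = 0` in the original is redundant but harmless. Like the original, this sketch does not reject names containing an interior NUL byte.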

132
reverie-process/src/pid.rs Normal file

@ -0,0 +1,132 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use core::fmt;
use core::hash::Hash;
use serde::{Deserialize, Serialize};
/// A process ID (PID).
#[derive(
Copy,
Clone,
Debug,
Eq,
PartialEq,
Ord,
PartialOrd,
Hash,
Serialize,
Deserialize
)]
pub struct Pid(libc::pid_t);
impl Pid {
/// Creates `Pid` from a raw `pid_t`.
pub fn from_raw(pid: libc::pid_t) -> Self {
Self(pid)
}
/// Returns the PID of the calling process.
pub fn this() -> Self {
nix::unistd::Pid::this().into()
}
/// Returns the PID of the parent of the calling process.
pub fn parent() -> Self {
nix::unistd::Pid::parent().into()
}
/// Gets the raw `pid_t` from this `Pid`.
pub fn as_raw(self) -> libc::pid_t {
self.0
}
/// Returns a `Display`able that is color-coded. That is, the same PID will
/// get the same color. This makes it easy to visually recognize PIDs when
/// looking through logs.
///
/// Note that while the same PIDs always have the same color, different PIDs
/// may also have the same color if they fall into the same color bucket.
pub fn colored(self) -> ColoredPid {
ColoredPid(self)
}
}
impl From<nix::unistd::Pid> for Pid {
fn from(pid: nix::unistd::Pid) -> Pid {
Self(pid.as_raw())
}
}
impl From<Pid> for nix::unistd::Pid {
fn from(pid: Pid) -> nix::unistd::Pid {
nix::unistd::Pid::from_raw(pid.as_raw())
}
}
impl From<Pid> for libc::pid_t {
fn from(pid: Pid) -> libc::pid_t {
pid.as_raw()
}
}
impl From<libc::pid_t> for Pid {
fn from(pid: libc::pid_t) -> Pid {
Pid::from_raw(pid)
}
}
impl fmt::Display for Pid {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
fmt::Display::fmt(&self.0, f)
}
}
/// A colored pid.
pub struct ColoredPid(Pid);
impl ColoredPid {
/// Gets the ansi color code for the current PID. Returns `None` if not
/// writing to a terminal.
fn ansi_code(&self) -> Option<&'static str> {
if colored::control::SHOULD_COLORIZE.should_colorize() {
// Why not just use `colored::Colorize` you ask? It allocates a
// string in order to create the color code. Since we may log a lot
// of output that may contain a lot of PIDs, we don't want that to
// slow us down.
Some(match self.0.as_raw() % 14 {
0 => "\x1b[0;31m", // Red
1 => "\x1b[0;32m", // Green
2 => "\x1b[0;33m", // Yellow
3 => "\x1b[0;34m", // Blue
4 => "\x1b[0;35m", // Magenta
5 => "\x1b[0;36m", // Cyan
6 => "\x1b[0;37m", // White
7 => "\x1b[1;31m", // Bright red
8 => "\x1b[1;32m", // Bright green
9 => "\x1b[1;33m", // Bright yellow
10 => "\x1b[1;34m", // Bright blue
11 => "\x1b[1;35m", // Bright magenta
12 => "\x1b[1;36m", // Bright cyan
_ => "\x1b[1;37m", // Bright white
})
} else {
None
}
}
}
impl fmt::Display for ColoredPid {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
if let Some(color) = self.ansi_code() {
write!(f, "{}{}\x1b[0m", color, self.0)
} else {
fmt::Display::fmt(&self.0, f)
}
}
}
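
The allocation-free coloring used by `ColoredPid` above comes down to mapping the PID to a `&'static str` escape code and writing it straight into the formatter. A reduced sketch follows, assuming three color buckets instead of fourteen and no terminal detection (the original consults the `colored` crate for that). `Colored` is a hypothetical stand-in type.

```rust
use std::fmt;

/// A PID that displays with an ANSI color chosen by its value.
/// (Hypothetical stand-in for `ColoredPid`.)
struct Colored(i32);

impl Colored {
    /// Picks a static escape code, so no string is built per call.
    fn ansi_code(&self) -> &'static str {
        match self.0 % 3 {
            0 => "\x1b[0;31m", // Red
            1 => "\x1b[0;32m", // Green
            _ => "\x1b[0;33m", // Yellow
        }
    }
}

impl fmt::Display for Colored {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Color code, then the PID, then the reset sequence.
        write!(f, "{}{}\x1b[0m", self.ansi_code(), self.0)
    }
}
```

PIDs that are congruent modulo the bucket count share a color, which is the collision the `colored` doc comment warns about.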

203
reverie-process/src/pty.rs Normal file

@ -0,0 +1,203 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use super::fd::{AsyncFd, Fd};
use syscalls::Errno;
use tokio::io::{AsyncRead, AsyncWrite, ReadBuf};
use core::mem::MaybeUninit;
use core::pin::Pin;
use core::task::{Context, Poll};
use std::io;
use std::os::unix::io::AsRawFd;
use std::os::unix::io::IntoRawFd;
use std::os::unix::io::RawFd;
/// Represents a pseudo-TTY "master".
#[derive(Debug)]
pub struct Pty {
fd: AsyncFd,
}
impl Pty {
/// Opens a new pseudo-TTY master.
///
/// NOTE: As long as there is a handle open to at least one child pty, reads
/// will not reach EOF and will continue to return `EWOULDBLOCK`.
pub fn open() -> Result<Self, Errno> {
let fd = Fd::new(Errno::result(unsafe {
libc::posix_openpt(libc::O_RDWR | libc::O_NOCTTY)
})?);
Errno::result(unsafe { libc::grantpt(fd.as_raw_fd()) })?;
Errno::result(unsafe { libc::unlockpt(fd.as_raw_fd()) })?;
let fd = AsyncFd::new(fd)?;
Ok(Self { fd })
}
/// Opens a pseudo-TTY slave that is connected to this master.
pub fn child(&self) -> Result<PtyChild, Errno> {
const TIOCGPTPEER: libc::c_ulong = 0x5441;
let parent = self.fd.as_raw_fd();
let fd = Errno::result(unsafe {
// NOTE: This ioctl isn't supported until Linux v4.13 (see
// `ioctl_tty(2)`), so we may fall back to path-based slave fd
// allocation.
libc::ioctl(parent, TIOCGPTPEER, libc::O_RDWR | libc::O_NOCTTY)
})
.map(Fd::new)
.or_else(|_err| {
let mut path: [libc::c_char; libc::PATH_MAX as usize] = [0; libc::PATH_MAX as usize];
Errno::result(unsafe { libc::ptsname_r(parent, path.as_mut_ptr(), path.len()) })?;
Fd::open_c(path.as_ptr(), libc::O_RDWR | libc::O_NOCTTY)
})?;
Ok(PtyChild { fd })
}
}
/// A pseudo-TTY child (or "slave" in TTY parlance). This is passed to child
/// processes.
#[derive(Debug)]
pub struct PtyChild {
fd: Fd,
}
impl PtyChild {
/// Sets the pseudo-TTY child as the controlling terminal for the current
/// process.
///
/// Specifically, this does several things:
/// 1. Calls setsid to create a new session.
/// 2. Makes this fd the controlling terminal of this process by running the
/// correct ioctl.
/// 3. Calls `dup2` to set each stdio stream to redirect to this fd.
/// 4. Closes the fd.
pub fn login(self) -> Result<(), Errno> {
Errno::result(unsafe { libc::login_tty(self.fd.into_raw_fd()) })?;
Ok(())
}
/// Sets the window size in rows and columns.
pub fn set_window_size(&self, rows: u16, cols: u16) -> Result<(), Errno> {
let fd = self.fd.as_raw_fd();
let winsize = libc::winsize {
ws_row: rows,
ws_col: cols,
ws_xpixel: 0,
ws_ypixel: 0,
};
Errno::result(unsafe { libc::ioctl(fd, libc::TIOCSWINSZ, &winsize as *const _) })?;
Ok(())
}
/// Returns the window size in terms of rows and columns.
pub fn window_size(&self) -> Result<(u16, u16), Errno> {
let fd = self.fd.as_raw_fd();
let mut winsize = MaybeUninit::<libc::winsize>::uninit();
Errno::result(unsafe { libc::ioctl(fd, libc::TIOCGWINSZ, winsize.as_mut_ptr()) })?;
let winsize = unsafe { winsize.assume_init() };
Ok((winsize.ws_row, winsize.ws_col))
}
/// Sets the terminal parameters.
pub fn set_terminal_params(&self, params: &libc::termios) -> Result<(), Errno> {
let fd = self.fd.as_raw_fd();
Errno::result(unsafe { libc::tcsetattr(fd, libc::TCSAFLUSH, params as *const _) })?;
Ok(())
}
/// Gets the terminal parameters.
pub fn terminal_params(&self) -> Result<libc::termios, Errno> {
let fd = self.fd.as_raw_fd();
let mut term = MaybeUninit::<libc::termios>::uninit();
Errno::result(unsafe { libc::tcgetattr(fd, term.as_mut_ptr()) })?;
Ok(unsafe { term.assume_init() })
}
}
impl AsRawFd for Pty {
fn as_raw_fd(&self) -> RawFd {
self.fd.as_raw_fd()
}
}
impl AsRawFd for PtyChild {
fn as_raw_fd(&self) -> RawFd {
self.fd.as_raw_fd()
}
}
impl AsyncWrite for Pty {
fn poll_write(
mut self: Pin<&mut Self>,
cx: &mut Context<'_>,
buf: &[u8],
) -> Poll<tokio::io::Result<usize>> {
Pin::new(&mut self.fd).poll_write(cx, buf)
}
fn poll_flush(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<io::Result<()>> {
Pin::new(&mut self.fd).poll_flush(cx)
}
fn poll_shutdown(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<io::Result<()>> {
Pin::new(&mut self.fd).poll_shutdown(cx)
}
}
impl AsyncRead for Pty {
fn poll_read(
mut self: Pin<&mut Self>,
cx: &mut Context,
buf: &mut ReadBuf,
) -> Poll<tokio::io::Result<()>> {
Pin::new(&mut self.fd).poll_read(cx, buf)
}
}
#[cfg(test)]
mod tests {
use super::*;
#[tokio::test]
async fn test_open() {
let pty = Pty::open().unwrap();
let child1 = pty.child().unwrap();
child1.set_window_size(20, 40).unwrap();
assert_eq!(child1.window_size().unwrap(), (20, 40));
let child2 = pty.child().unwrap();
child2.set_window_size(40, 80).unwrap();
assert_eq!(child2.window_size().unwrap(), (40, 80));
// Since they're both connected to the same master, changing the window
// size of one child affects both of them.
assert_eq!(child1.window_size().unwrap(), (40, 80));
}
}


@ -0,0 +1,440 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
#![allow(non_snake_case)]
use syscalls::Errno;
use syscalls::Sysno;
pub use libc::sock_filter;
// See: /include/uapi/linux/bpf_common.h
// Instruction classes
pub const BPF_LD: u16 = 0x00;
pub const BPF_ST: u16 = 0x02;
pub const BPF_JMP: u16 = 0x05;
pub const BPF_RET: u16 = 0x06;
// ld/ldx fields
pub const BPF_W: u16 = 0x00;
pub const BPF_ABS: u16 = 0x20;
pub const BPF_MEM: u16 = 0x60;
pub const BPF_JEQ: u16 = 0x10;
pub const BPF_JGT: u16 = 0x20;
pub const BPF_JGE: u16 = 0x30;
pub const BPF_K: u16 = 0x00;
/// Maximum number of instructions.
pub const BPF_MAXINSNS: usize = 4096;
/// Defined in `/include/uapi/linux/seccomp.h`.
const SECCOMP_SET_MODE_FILTER: u32 = 1;
/// Offset of `seccomp_data::nr` in bytes.
const SECCOMP_DATA_OFFSET_NR: u32 = 0;
/// Offset of `seccomp_data::arch` in bytes.
const SECCOMP_DATA_OFFSET_ARCH: u32 = 4;
/// Offset of `seccomp_data::instruction_pointer` in bytes.
const SECCOMP_DATA_OFFSET_IP: u32 = 8;
/// Offset of `seccomp_data::args` in bytes.
#[allow(unused)]
const SECCOMP_DATA_OFFSET_ARGS: u32 = 16;
#[cfg(target_endian = "little")]
const SECCOMP_DATA_OFFSET_IP_HI: u32 = SECCOMP_DATA_OFFSET_IP + 4;
#[cfg(target_endian = "little")]
const SECCOMP_DATA_OFFSET_IP_LO: u32 = SECCOMP_DATA_OFFSET_IP;
#[cfg(target_endian = "big")]
const SECCOMP_DATA_OFFSET_IP_HI: u32 = SECCOMP_DATA_OFFSET_IP;
#[cfg(target_endian = "big")]
const SECCOMP_DATA_OFFSET_IP_LO: u32 = SECCOMP_DATA_OFFSET_IP + 4;
// These are defined in `/include/uapi/linux/elf-em.h`.
const EM_386: u32 = 3;
const EM_MIPS: u32 = 8;
const EM_PPC: u32 = 20;
const EM_PPC64: u32 = 21;
const EM_ARM: u32 = 40;
const EM_X86_64: u32 = 62;
const EM_AARCH64: u32 = 183;
// These are defined in `/include/uapi/linux/audit.h`.
const __AUDIT_ARCH_64BIT: u32 = 0x8000_0000;
const __AUDIT_ARCH_LE: u32 = 0x4000_0000;
// These are defined in `/include/uapi/linux/audit.h`.
pub const AUDIT_ARCH_X86: u32 = EM_386 | __AUDIT_ARCH_LE;
pub const AUDIT_ARCH_X86_64: u32 = EM_X86_64 | __AUDIT_ARCH_64BIT | __AUDIT_ARCH_LE;
pub const AUDIT_ARCH_ARM: u32 = EM_ARM | __AUDIT_ARCH_LE;
pub const AUDIT_ARCH_AARCH64: u32 = EM_AARCH64 | __AUDIT_ARCH_64BIT | __AUDIT_ARCH_LE;
pub const AUDIT_ARCH_MIPS: u32 = EM_MIPS;
pub const AUDIT_ARCH_PPC: u32 = EM_PPC;
pub const AUDIT_ARCH_PPC64: u32 = EM_PPC64 | __AUDIT_ARCH_64BIT;
/// Seccomp-BPF program byte code.
#[derive(Debug, Clone, Eq, PartialEq)]
pub struct Filter {
// Since the limit is 4096 instructions, we *could* use a static array here
// instead. However, that would require bounds checks each time an
// instruction is appended and complicate the interface with `Result` types
// and error handling logic. It's cleaner to just check the size when the
// program is loaded.
filter: Vec<sock_filter>,
}
impl Filter {
/// Creates a new, empty seccomp program. Note that empty BPF programs are not
/// valid and will fail to load.
pub const fn new() -> Self {
Self { filter: Vec::new() }
}
/// Appends a single instruction to the seccomp-BPF program.
pub fn push(&mut self, instruction: sock_filter) {
self.filter.push(instruction);
}
/// Returns the number of instructions in the BPF program.
pub fn len(&self) -> usize {
self.filter.len()
}
/// Returns true if the program is empty. Empty seccomp filters will result
/// in an error when loaded.
pub fn is_empty(&self) -> bool {
self.filter.is_empty()
}
/// Loads the program via seccomp into the current process.
///
/// Once loaded, the seccomp filter can never be removed. Additional seccomp
/// filters can be loaded, however, and they will chain together and be
/// executed in reverse order.
///
/// NOTE: The maximum size of any single seccomp-bpf filter is 4096
/// instructions. The overall limit is 32768 instructions across all loaded
/// filters.
///
/// See [`seccomp(2)`](https://man7.org/linux/man-pages/man2/seccomp.2.html)
/// for more details.
pub fn load(&self) -> Result<(), Errno> {
let len = self.filter.len();
if len == 0 || len > BPF_MAXINSNS {
return Err(Errno::EINVAL);
}
let prog = libc::sock_fprog {
// Note: length is guaranteed to be less than `u16::MAX` because of
// the above check.
len: len as u16,
filter: self.filter.as_ptr() as *mut _,
};
let ptr = &prog as *const libc::sock_fprog;
Errno::result(unsafe {
libc::syscall(libc::SYS_seccomp, SECCOMP_SET_MODE_FILTER, 0, ptr)
})?;
Ok(())
}
}
impl Extend<sock_filter> for Filter {
fn extend<T: IntoIterator<Item = sock_filter>>(&mut self, iter: T) {
self.filter.extend(iter)
}
}
/// Trait for types that can emit BPF byte code.
pub trait ByteCode {
/// Accumulates BPF instructions into the given filter.
fn into_bpf(self, filter: &mut Filter);
}
impl<F> ByteCode for F
where
F: FnOnce(&mut Filter),
{
fn into_bpf(self, filter: &mut Filter) {
self(filter)
}
}
impl ByteCode for sock_filter {
fn into_bpf(self, filter: &mut Filter) {
filter.push(self)
}
}
/// Returns a seccomp-bpf filter containing the given list of instructions.
///
/// This can be concatenated with other seccomp-BPF programs.
///
/// Note that this is not a true BPF program. Seccomp-bpf is a subset of BPF and
/// so many instructions are not available.
///
/// When executing instructions, the BPF program operates on the syscall
/// information made available as a (read-only) buffer of the following form:
///
/// ```no_compile
/// struct seccomp_data {
/// // The syscall number.
/// nr: u32,
/// // `AUDIT_ARCH_*` value (see `<linux/audit.h>`).
/// arch: u32,
/// // CPU instruction pointer.
/// instruction_pointer: u64,
/// // Up to 6 syscall arguments.
/// args: [u64; 6],
/// }
/// ```
///
/// # Example
///
/// This filter will allow only the specified syscalls.
/// ```
/// let _filter = seccomp_bpf![
/// // Make sure the target process is using the x86-64 syscall ABI.
/// VALIDATE_ARCH(AUDIT_ARCH_X86_64),
/// // Load the current syscall number (`seccomp_data.nr`) into the accumulator.
/// LOAD_SYSCALL_NR,
/// // Check if `seccomp_data.nr` matches the given syscalls. If so, then return
/// // from the seccomp filter early, allowing the syscall to continue.
/// SYSCALL(Sysno::open, ALLOW),
/// SYSCALL(Sysno::close, ALLOW),
/// SYSCALL(Sysno::write, ALLOW),
/// SYSCALL(Sysno::read, ALLOW),
/// // Deny all other syscalls by having the kernel kill the current thread with
/// // `SIGSYS`.
/// DENY,
/// ];
/// ```
#[cfg(test)]
macro_rules! seccomp_bpf {
($($inst:expr),+ $(,)?) => {
{
let mut filter = Filter::new();
$(
$inst.into_bpf(&mut filter);
)+
filter
}
};
}
// See: /include/uapi/linux/filter.h
pub const fn BPF_STMT(code: u16, k: u32) -> sock_filter {
sock_filter {
code,
jt: 0,
jf: 0,
k,
}
}
/// A BPF jump instruction.
///
/// # Arguments
///
/// * `code` is the operation code.
/// * `k` is the value operated on for comparisons.
/// * `jt` is the relative offset to jump to if the comparison is true.
/// * `jf` is the relative offset to jump to if the comparison is false.
///
/// # Example
///
/// ```no_compile
/// // Skip the next instruction if the loaded value is equal to 42.
/// BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, 42, 1, 0);
/// ```
pub const fn BPF_JUMP(code: u16, k: u32, jt: u8, jf: u8) -> sock_filter {
sock_filter { code, jt, jf, k }
}
/// Loads the syscall number (`seccomp_data.nr`) into the accumulator.
pub const LOAD_SYSCALL_NR: sock_filter = BPF_STMT(BPF_LD + BPF_W + BPF_ABS, SECCOMP_DATA_OFFSET_NR);
/// Returns from the seccomp filter, allowing the syscall to pass through.
#[allow(unused)]
pub const ALLOW: sock_filter = BPF_STMT(BPF_RET + BPF_K, libc::SECCOMP_RET_ALLOW);
/// Returns from the seccomp filter, instructing the kernel to kill the calling
/// thread with `SIGSYS` before executing the syscall.
#[allow(unused)]
pub const DENY: sock_filter = BPF_STMT(BPF_RET + BPF_K, libc::SECCOMP_RET_KILL_THREAD);
/// Returns from the seccomp filter, causing `SIGSYS` to be sent to the calling
/// thread and the syscall to be skipped without executing. Unlike [`DENY`], this
/// signal can be caught.
#[allow(unused)]
pub const TRAP: sock_filter = BPF_STMT(BPF_RET + BPF_K, libc::SECCOMP_RET_TRAP);
/// Returns from the seccomp filter, causing `PTRACE_EVENT_SECCOMP` to be
/// generated for this syscall (if `PTRACE_O_TRACESECCOMP` is enabled). If no
/// tracer is present, the syscall will not be executed and fails with `ENOSYS`
/// instead.
///
/// `data` is made available to the tracer via `PTRACE_GETEVENTMSG`.
#[allow(unused)]
pub fn TRACE(data: u16) -> sock_filter {
BPF_STMT(
BPF_RET + BPF_K,
libc::SECCOMP_RET_TRACE | (data as u32 & libc::SECCOMP_RET_DATA),
)
}
/// Returns from the seccomp filter, returning the given error instead of
/// executing the syscall.
#[allow(unused)]
pub fn ERRNO(err: Errno) -> sock_filter {
BPF_STMT(
BPF_RET + BPF_K,
libc::SECCOMP_RET_ERRNO | (err.into_raw() as u32 & libc::SECCOMP_RET_DATA),
)
}
macro_rules! instruction {
(
$(
$(#[$attrs:meta])*
$vis:vis fn $name:ident($($args:tt)*) {
$($instruction:expr;)*
}
)*
) => {
$(
$vis fn $name($($args)*) -> impl ByteCode {
move |filter: &mut Filter| {
$(
$instruction.into_bpf(filter);
)*
}
}
)*
};
}
instruction! {
/// Checks that architecture matches our target architecture. If it does not
/// match, kills the current process. This should be the first step for every
/// seccomp filter to ensure we're working with the syscall table we're
/// expecting. Each architecture has a slightly different syscall table and
/// we need to make sure the syscall numbers we're using are the right ones
/// for the architecture.
pub fn VALIDATE_ARCH(target_arch: u32) {
// Load `seccomp_data.arch`
BPF_STMT(BPF_LD + BPF_W + BPF_ABS, SECCOMP_DATA_OFFSET_ARCH);
BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, target_arch, 1, 0);
BPF_STMT(BPF_RET + BPF_K, libc::SECCOMP_RET_KILL_PROCESS);
}
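/// Loads the 64-bit instruction pointer from `seccomp_data` into scratch
/// memory: the low 32 bits go to `M[0]` and the high 32 bits to `M[1]`. The
/// accumulator holds the high bits afterwards.
pub fn LOAD_SYSCALL_IP() {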
BPF_STMT(BPF_LD + BPF_W + BPF_ABS, SECCOMP_DATA_OFFSET_IP_LO);
// M[0] = lo
BPF_STMT(BPF_ST, 0);
BPF_STMT(BPF_LD + BPF_W + BPF_ABS, SECCOMP_DATA_OFFSET_IP_HI);
// M[1] = hi
BPF_STMT(BPF_ST, 1);
}
/// Checks if `seccomp_data.nr` matches the given syscall. If so, then jumps
/// to `action`.
///
/// # Example
/// ```no_compile
/// SYSCALL(Sysno::socket, DENY);
/// ```
pub fn SYSCALL(nr: Sysno, action: sock_filter) {
BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, nr as i32 as u32, 0, 1);
action;
}
fn IP_RANGE64(blo: u32, bhi: u32, elo: u32, ehi: u32, action: sock_filter) {
// Most of the complexity below is caused by seccomp-bpf only being able
// to operate on `u32` values. We also can't reuse `JGE64` and `JLE64`
// because the jump offsets would be incorrect.
// STEP1: if !(begin > arg) goto NOMATCH;
// if (begin_hi > arg.hi) goto Step2;
BPF_JUMP(BPF_JMP + BPF_JGT + BPF_K, bhi, 4 /* goto STEP2 */, 0);
// if (begin_hi != arg.hi) goto NOMATCH;
BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, bhi, 0, 9 /* goto NOMATCH */);
// Load M[0] to operate on the low bits of the IP.
BPF_STMT(BPF_LD + BPF_MEM, 0);
// if (arg.lo < begin_lo) goto NOMATCH;
BPF_JUMP(BPF_JMP + BPF_JGE + BPF_K, blo, 0, 7 /* goto NOMATCH */);
// Load M[1] because the next instruction expects the high bits of the
// IP.
BPF_STMT(BPF_LD + BPF_MEM, 1);
// STEP2: if !(arg > end) goto NOMATCH;
// if (end_hi < arg.hi) goto MATCH;
BPF_JUMP(BPF_JMP + BPF_JGT + BPF_K, ehi, 0, 4 /* goto MATCH */);
// if (end_hi != arg.hi) goto NOMATCH;
BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, ehi, 0, 5 /* goto NOMATCH */);
BPF_STMT(BPF_LD + BPF_MEM, 0);
// if (end_lo < arg.lo) goto MATCH;
BPF_JUMP(BPF_JMP + BPF_JGE + BPF_K, elo, 2 /* goto NOMATCH */, 0);
BPF_STMT(BPF_LD + BPF_MEM, 1);
// MATCH: Take the action.
action;
// NOMATCH: Load M[1] again after we loaded M[0].
BPF_STMT(BPF_LD + BPF_MEM, 1);
}
}
/// Checks if the instruction pointer is within the given range. If so, executes
/// `action`. Otherwise, falls through.
///
/// Note that if `ip == end`, this will not match. That is, the interval is
/// half-open: `[begin, end)`.
///
/// Precondition: The instruction pointer must be loaded with [`LOAD_SYSCALL_IP`]
/// first.
pub fn IP_RANGE(begin: u64, end: u64, action: sock_filter) -> impl ByteCode {
let begin_lo = begin as u32;
let begin_hi = (begin >> 32) as u32;
let end_lo = end as u32;
let end_hi = (end >> 32) as u32;
IP_RANGE64(begin_lo, begin_hi, end_lo, end_hi, action)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn smoke() {
let filter = seccomp_bpf![
VALIDATE_ARCH(AUDIT_ARCH_X86_64),
LOAD_SYSCALL_NR,
SYSCALL(Sysno::open, DENY),
SYSCALL(Sysno::close, DENY),
SYSCALL(Sysno::write, DENY),
SYSCALL(Sysno::read, DENY),
ALLOW,
];
assert_eq!(filter.len(), 13);
}
}
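
The high/low comparison that `IP_RANGE64` encodes in byte code can be easier to follow in plain Rust. The following is only a sketch and not a translation of the filter: `split`, `ge64`, and `ip_in_range` are hypothetical helper names, and the code models the documented `begin <= ip < end` semantics while restricting itself to `u32` comparisons, as seccomp-bpf requires.

```rust
/// Splits a 64-bit value into its (low, high) 32-bit halves, mirroring what
/// `IP_RANGE` does before generating byte code.
fn split(x: u64) -> (u32, u32) {
    (x as u32, (x >> 32) as u32)
}

/// Returns true if `a >= b`, where both values are given as (low, high)
/// halves and only 32-bit comparisons are used.
fn ge64(a: (u32, u32), b: (u32, u32)) -> bool {
    let (a_lo, a_hi) = a;
    let (b_lo, b_hi) = b;
    if a_hi != b_hi {
        // The high halves decide the ordering on their own.
        a_hi > b_hi
    } else {
        // Equal high halves: fall back to the low halves.
        a_lo >= b_lo
    }
}

/// Returns true if `begin <= ip < end` (a half-open interval).
fn ip_in_range(ip: u64, begin: u64, end: u64) -> bool {
    let ip = split(ip);
    ge64(ip, split(begin)) && !ge64(ip, split(end))
}
```

The two-step structure (compare the high halves, and only consult the low halves on a tie) is the same idea the byte code expresses with conditional jumps and the `M[0]`/`M[1]` scratch slots.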

@ -0,0 +1,338 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
//! Provides helpers for constructing a [`seccomp`][seccomp] filter. This is a
//! pure Rust implementation and does not require libseccomp.
//!
//! # Seccomp Background
//!
//! [`seccomp(2)`][seccomp] is a powerful tool for changing how a process tree
//! behaves when a syscall happens. Seccomp can be used to install a filter that
//! applies to every child process in a process tree. Since filters cannot be
//! removed, they can only get more restrictive. The language used for filters is
//! called `seccomp-bpf`. It is a subset of the BPF byte code language.
//!
//! Some of the restrictions include:
//! - Only being able to JMP forward and never backward. This prevents loops and
//! ensures seccomp-bpf filters always terminate. This is also true of BPF.
//! - Cannot call libbpf functions.
//! - Cannot operate on 64-bit integers, only 32-bit integers.
//!
//! [seccomp]: https://man7.org/linux/man-pages/man2/seccomp.2.html
//!
//! You can think of a seccomp-bpf program as a little function that gets
//! executed for every syscall:
//!
//! ```no_compile
//! // NOTE: seccomp-bpf programs are actually written in byte code, but if a
//! // high-level language could be compiled to BPF byte code, this is what it'd
//! // look like.
//! fn my_program(data: seccomp_data) -> Action {
//! if data.nr == 2 {
//! return Action::Trace;
//! }
//!
//! if data.nr == 3 {
//! return Action::KillProcess;
//! }
//!
//! // Allow the syscall by default.
//! Action::Allow
//! }
//! ```
//!
//! where `seccomp_data` is defined as:
//!
//! ```no_compile
//! struct seccomp_data {
//! // The syscall number.
//! nr: u32,
//! // The architecture.
//! arch: u32,
//! // Instruction pointer.
//! ip: u64,
//! // The 6 syscall arguments.
//! args: [u64; 6],
//! }
//! ```
//!
//! This is the only input available to the seccomp filter and is the only bit of
//! data available to make a decision about a syscall (i.e., an "action"). An
//! action might be nothing (i.e., allow the syscall through), kill the
//! process/thread with `SIGSYS`, forward the syscall to ptrace, or return an
//! error code.
#[macro_use]
mod bpf;
use bpf::*;
use syscalls::Errno;
use syscalls::Sysno;
pub use bpf::Filter;
use std::collections::BTreeMap;
/// Builder for creating seccomp filters.
#[derive(Clone)]
pub struct FilterBuilder {
/// The target architecture.
target_arch: TargetArch,
/// The action to take if there are no matches.
default_action: Action,
/// The action to take for each syscall.
syscalls: BTreeMap<Sysno, Action>,
/// Ranges of instruction pointer values.
ip_ranges: Vec<(u64, u64, Action)>,
}
/// The target architecture.
#[allow(non_camel_case_types, missing_docs)]
#[derive(Debug, Copy, Clone)]
#[repr(u32)]
pub enum TargetArch {
x86 = AUDIT_ARCH_X86,
x86_64 = AUDIT_ARCH_X86_64,
mips = AUDIT_ARCH_MIPS,
powerpc = AUDIT_ARCH_PPC,
powerpc64 = AUDIT_ARCH_PPC64,
arm = AUDIT_ARCH_ARM,
aarch64 = AUDIT_ARCH_AARCH64,
}
/// The action to take if the conditions of a rule all match.
#[derive(Debug, Copy, Clone)]
pub enum Action {
/// Allows the syscall to be executed.
Allow,
/// Returns the specified error instead of executing the syscall.
Errno(Errno),
/// Prevents the syscall from being executed and the kernel will kill the
/// calling thread with `SIGSYS`.
KillThread,
/// Prevents the syscall from being executed and the kernel will kill the
/// calling process with `SIGSYS`.
KillProcess,
/// Same as [`Action::Allow`] but logs the call.
Log,
/// If the thread is being ptraced and the tracing process specified
/// `PTRACE_O_TRACESECCOMP`, the tracing process will be notified via
/// `PTRACE_EVENT_SECCOMP` and the value provided can be retrieved using
/// `PTRACE_GETEVENTMSG`.
Trace(u16),
/// Prevents the syscall from being executed and sends a catchable `SIGSYS`
/// to the calling thread.
Trap,
}
impl From<Action> for u32 {
fn from(action: Action) -> u32 {
match action {
Action::Allow => libc::SECCOMP_RET_ALLOW,
Action::Errno(x) => {
libc::SECCOMP_RET_ERRNO | (x.into_raw() as u32 & libc::SECCOMP_RET_DATA)
}
Action::KillThread => libc::SECCOMP_RET_KILL_THREAD,
Action::KillProcess => libc::SECCOMP_RET_KILL_PROCESS,
Action::Log => libc::SECCOMP_RET_LOG,
Action::Trace(x) => libc::SECCOMP_RET_TRACE | (x as u32 & libc::SECCOMP_RET_DATA),
Action::Trap => libc::SECCOMP_RET_TRAP,
}
}
}
impl From<Action> for sock_filter {
fn from(action: Action) -> sock_filter {
BPF_STMT(BPF_RET + BPF_K, u32::from(action))
}
}
impl TargetArch {
#![allow(missing_docs)]
#[cfg(target_arch = "x86")]
pub const CURRENT: TargetArch = Self::x86;
#[cfg(target_arch = "x86_64")]
pub const CURRENT: TargetArch = Self::x86_64;
#[cfg(target_arch = "mips")]
pub const CURRENT: TargetArch = Self::mips;
#[cfg(target_arch = "powerpc")]
pub const CURRENT: TargetArch = Self::powerpc;
#[cfg(target_arch = "powerpc64")]
pub const CURRENT: TargetArch = Self::powerpc64;
#[cfg(target_arch = "arm")]
pub const CURRENT: TargetArch = Self::arm;
#[cfg(target_arch = "aarch64")]
pub const CURRENT: TargetArch = Self::aarch64;
}
impl Default for TargetArch {
fn default() -> Self {
Self::CURRENT
}
}
impl Default for FilterBuilder {
fn default() -> Self {
Self::new()
}
}
impl FilterBuilder {
/// Creates the seccomp filter builder.
pub fn new() -> Self {
Self {
target_arch: TargetArch::default(),
default_action: Action::KillThread,
syscalls: Default::default(),
ip_ranges: Default::default(),
}
}
/// Sets the target architecture. If this doesn't match the architecture of
/// the process, then the process is killed. This is the first step in the
/// seccomp filter and ensures that we're working with the right syscall
/// table. Each architecture has a slightly different syscall table and we
/// need to make sure the syscall numbers we're using are the right ones for
/// the architecture.
///
/// By default, the target architecture is set to the architecture of the
/// current program (i.e., `TargetArch::CURRENT`).
pub fn target_arch(&mut self, target_arch: TargetArch) -> &mut Self {
self.target_arch = target_arch;
self
}
/// The default action to take if there are no matches. By default, the
/// default action is to kill the current thread (i.e., the filter becomes an
/// allowlist).
///
/// When using an allowlist of syscalls, this should be set to
/// `Action::KillThread` or `Action::KillProcess`.
///
/// When using a blocklist of syscalls, this should be set to
/// `Action::Allow`.
pub fn default_action(&mut self, action: Action) -> &mut Self {
self.default_action = action;
self
}
/// Sets the action to take for the given syscall.
pub fn syscall(&mut self, syscall: Sysno, action: Action) -> &mut Self {
self.syscalls.insert(syscall, action);
self
}
/// Sets the action to take for a set of syscalls.
pub fn syscalls<I>(&mut self, table: I) -> &mut Self
where
I: IntoIterator<Item = (Sysno, Action)>,
{
self.syscalls.extend(table);
self
}
/// Take an action if the instruction pointer `ip >= begin && ip < end`.
///
/// This is useful in conjunction with `mmap`. For example, we can use this
/// to deny any syscalls made outside of `ld.so` or `libc.so`. It can also be
/// used to avoid tracing syscalls injected with ptrace.
///
/// Multiple ranges can be added and are checked in sequence.
pub fn ip_range(&mut self, begin: u64, end: u64, action: Action) -> &mut Self {
self.ip_ranges.push((begin, end, action));
self
}
/// Adds multiple IP ranges. This is equivalent to calling
/// [`FilterBuilder::ip_range`] multiple times.
pub fn ip_ranges<I>(&mut self, ranges: I) -> &mut Self
where
I: IntoIterator<Item = (u64, u64, Action)>,
{
self.ip_ranges.extend(ranges);
self
}
/// Generates the byte code for the filter.
pub fn build(&self) -> Filter {
let mut filter = Filter::new();
// This should be the first step for every seccomp-bpf filter.
VALIDATE_ARCH(self.target_arch as u32).into_bpf(&mut filter);
if !self.ip_ranges.is_empty() {
LOAD_SYSCALL_IP().into_bpf(&mut filter);
for (begin, end, action) in &self.ip_ranges {
IP_RANGE(*begin, *end, (*action).into()).into_bpf(&mut filter);
}
}
if !self.syscalls.is_empty() {
// Load the syscall number.
LOAD_SYSCALL_NR.into_bpf(&mut filter);
for (syscall, action) in &self.syscalls {
SYSCALL(*syscall, (*action).into()).into_bpf(&mut filter);
}
}
// The default action is always performed last.
sock_filter::from(self.default_action).into_bpf(&mut filter);
filter
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn smoke() {
assert_eq!(
FilterBuilder::new()
.default_action(Action::Allow)
.target_arch(TargetArch::x86_64)
.syscalls([
(Sysno::read, Action::KillThread),
(Sysno::write, Action::KillThread),
(Sysno::open, Action::KillThread),
(Sysno::close, Action::KillThread),
(Sysno::write, Action::KillThread),
])
.build(),
seccomp_bpf![
VALIDATE_ARCH(AUDIT_ARCH_X86_64),
LOAD_SYSCALL_NR,
SYSCALL(Sysno::read, DENY),
SYSCALL(Sysno::write, DENY),
SYSCALL(Sysno::open, DENY),
SYSCALL(Sysno::close, DENY),
ALLOW,
]
);
}
}
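
The `From<Action> for u32` conversion packs an action and its 16-bit payload into a single return value. A minimal standalone sketch of that encoding follows. The constant values are copied from `<linux/seccomp.h>`; the trimmed-down `Action` enum, `encode`, and `split` are hypothetical stand-ins for illustration, not this crate's API.

```rust
// Values as defined in <linux/seccomp.h>.
const SECCOMP_RET_KILL_THREAD: u32 = 0x0000_0000;
const SECCOMP_RET_ERRNO: u32 = 0x0005_0000;
const SECCOMP_RET_TRACE: u32 = 0x7ff0_0000;
const SECCOMP_RET_ALLOW: u32 = 0x7fff_0000;
const SECCOMP_RET_ACTION_FULL: u32 = 0xffff_0000;
const SECCOMP_RET_DATA: u32 = 0x0000_ffff;

/// Hypothetical, reduced version of the `Action` enum.
#[derive(Debug, Clone, Copy)]
enum Action {
    Allow,
    Errno(i32),
    KillThread,
    Trace(u16),
}

/// Packs the action into the upper 16 bits and its payload into the lower 16.
fn encode(action: Action) -> u32 {
    match action {
        Action::Allow => SECCOMP_RET_ALLOW,
        Action::Errno(errno) => SECCOMP_RET_ERRNO | ((errno as u32) & SECCOMP_RET_DATA),
        Action::KillThread => SECCOMP_RET_KILL_THREAD,
        Action::Trace(data) => SECCOMP_RET_TRACE | ((data as u32) & SECCOMP_RET_DATA),
    }
}

/// Splits a filter return value back into (action bits, payload).
fn split(ret: u32) -> (u32, u32) {
    (ret & SECCOMP_RET_ACTION_FULL, ret & SECCOMP_RET_DATA)
}
```

Masking with `SECCOMP_RET_DATA` keeps a payload wider than 16 bits from spilling into the action bits.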

@ -0,0 +1,142 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use super::clone::clone;
use super::error::{Context, Error};
use super::fd::{pipe, Fd};
use super::id_map::make_id_map;
use super::stdio::{ChildStderr, ChildStdin, ChildStdout};
use super::util::CStringArray;
use super::Child;
use super::Command;
use super::container::ChildContext;
use std::io;
use std::io::Write;
impl Command {
/// Executes the command as a child process, returning a handle to it.
///
/// By default, stdin, stdout and stderr are inherited from the parent.
pub fn spawn(&mut self) -> Result<Child, Error> {
// Create a pipe to send back errors to the parent process if `execve`
// fails.
let (reader, mut writer) = pipe()?;
let child = self.spawn_with(|err| {
send_error(&mut writer, err);
1
})?;
// Close the writer end. Otherwise, the following read will hang
// forever.
drop(writer);
recv_error(reader)?;
Ok(child)
}
/// Spawns the child with a failure callback. The `onfail` callback runs in the
/// child process if an error occurs while setting up or executing the process.
/// Its return value is used as the child's exit code.
pub fn spawn_with<F>(&mut self, mut onfail: F) -> Result<Child, Error>
where
F: FnMut(Error) -> i32,
{
let env = self.container.env.array();
// Set up IO pipes
let (stdin, child_stdin) = self.container.stdin.pipes(true)?;
let (stdout, child_stdout) = self.container.stdout.pipes(false)?;
let (stderr, child_stderr) = self.container.stderr.pipes(false)?;
let clone_flags = self.container.namespace.bits() | libc::SIGCHLD;
let uid_map = &make_id_map(&self.container.uid_map);
let gid_map = &make_id_map(&self.container.gid_map);
let context = ChildContext {
stdin: child_stdin.as_ref(),
stdout: child_stdout.as_ref(),
stderr: child_stderr.as_ref(),
uid_map,
gid_map,
};
let pid = clone(
|| {
let code = onfail(self.do_exec(&context, &env).unwrap_err());
unsafe { libc::_exit(code) }
},
clone_flags,
)?;
drop(child_stdin);
drop(child_stdout);
drop(child_stderr);
drop(self.container.pty.take());
let stdin = stdin.map(ChildStdin::new).transpose()?;
let stdout = stdout.map(ChildStdout::new).transpose()?;
let stderr = stderr.map(ChildStderr::new).transpose()?;
Ok(Child {
pid,
exit_status: None,
stdin,
stdout,
stderr,
})
}
/// Note: This function MUST NOT allocate or deallocate any memory. Doing so
/// can cause deadlocks.
fn do_exec(&mut self, context: &ChildContext, env: &CStringArray) -> Result<!, Error> {
self.container.setup(context, &mut self.pre_exec)?;
let err = Error::result(
unsafe { libc::execvpe(self.program.as_ptr(), self.args.as_ptr(), env.as_ptr()) },
Context::Exec,
)
.unwrap_err();
Err(err)
}
}
/// Sends an error over the pipe. Any failure to write is ignored.
pub fn send_error(fd: &mut Fd, err: Error) {
// Writes up to PIPE_BUF (4096) should be atomic. There's also nothing we
// can do with an error if this fails.
let bytes: [u8; 8] = err.into();
let _ = fd.write(&bytes);
}
/// Tries to receive an error code from the pipe. If the other end of the
/// pipe is closed before sending an error, then `Ok(())` is returned.
pub fn recv_error(mut fd: Fd) -> Result<(), Error> {
use std::io::Read;
let mut err = [0u8; 8];
loop {
match fd.read(&mut err) {
Ok(0) => return Ok(()),
Ok(8) => return Err(Error::from(err)),
Ok(n) => {
// Sends up to PIPE_BUF (4096) should be atomic.
panic!("execve pipe: got unexpected number of bytes {}", n);
}
Err(err) if err.kind() == io::ErrorKind::Interrupted => {}
Err(err) => {
panic!("execve pipe: read returned unexpected error {}", err);
}
}
}
}
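
The error-reporting protocol used by `spawn` (the child writes a fixed 8-byte record on failure; a closed pipe with no data means success) can be exercised without forking. The sketch below makes several assumptions: a `UnixStream` pair stands in for the pipe, the `(errno, context)` wire layout in `encode`/`decode` is hypothetical, and, unlike `recv_error`, short reads are retried instead of causing a panic.

```rust
use std::io::{self, Read, Write};
use std::os::unix::net::UnixStream;

/// Hypothetical 8-byte wire format: an errno followed by a context tag.
fn encode(errno: i32, context: u32) -> [u8; 8] {
    let mut bytes = [0u8; 8];
    bytes[..4].copy_from_slice(&errno.to_ne_bytes());
    bytes[4..].copy_from_slice(&context.to_ne_bytes());
    bytes
}

fn decode(bytes: [u8; 8]) -> (i32, u32) {
    let errno = i32::from_ne_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    let context = u32::from_ne_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    (errno, context)
}

/// Reads one report. `Ok(None)` means the writer closed without sending
/// anything, which the parent interprets as a successful exec.
fn recv_report<R: Read>(mut reader: R) -> io::Result<Option<(i32, u32)>> {
    let mut buf = [0u8; 8];
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) if filled == 0 => return Ok(None),
            Ok(0) => return Err(io::ErrorKind::UnexpectedEof.into()),
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => {}
            Err(e) => return Err(e),
        }
    }
    Ok(Some(decode(buf)))
}

/// Runs one round trip over a socket pair standing in for the pipe.
fn round_trip(report: Option<(i32, u32)>) -> io::Result<Option<(i32, u32)>> {
    let (reader, mut writer) = UnixStream::pair()?;
    if let Some((errno, context)) = report {
        writer.write_all(&encode(errno, context))?;
    }
    // Close the write end; otherwise the read below would block forever.
    drop(writer);
    recv_report(reader)
}
```

In the real code the write end is closed in the child when the exec succeeds, so the parent sees end-of-file; here `drop(writer)` plays that role.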

@ -0,0 +1,240 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use super::fd::{pipe, AsyncFd, Fd};
use core::pin::Pin;
use core::task::{Context, Poll};
use std::io;
use std::os::unix::io::{AsRawFd, FromRawFd, IntoRawFd};
use syscalls::Errno;
use tokio::io::{AsyncRead, AsyncWrite, ReadBuf};
/// Describes what to do with a standard I/O stream for a child process when
/// passed to the [`stdin`], [`stdout`], and [`stderr`] methods of [`Command`].
///
/// [`stdin`]: super::Command::stdin
/// [`stdout`]: super::Command::stdout
/// [`stderr`]: super::Command::stderr
/// [`Command`]: super::Command
#[derive(Debug)]
pub struct Stdio(InnerStdio);
/// A handle to a child process's standard input (stdin).
///
/// This struct is used in the [`stdin`] field on [`Child`].
///
/// When an instance of `ChildStdin` is [dropped], the `ChildStdin`'s underlying
/// file handle will be closed. If the child process was blocked on input prior
/// to being dropped, it will become unblocked after dropping.
///
/// [`stdin`]: super::Child::stdin
/// [`Child`]: super::Child
/// [dropped]: Drop
#[derive(Debug)]
pub struct ChildStdin(AsyncFd);
/// A handle to a child process's standard output (stdout).
///
/// This struct is used in the [`stdout`] field on [`Child`].
///
/// When an instance of `ChildStdout` is [dropped], the `ChildStdout`'s
/// underlying file handle will be closed.
///
/// [`stdout`]: super::Child::stdout
/// [`Child`]: super::Child
/// [dropped]: Drop
#[derive(Debug)]
pub struct ChildStdout(AsyncFd);
/// A handle to a child process's standard error (stderr).
///
/// This struct is used in the [`stderr`] field on [`Child`].
///
/// When an instance of `ChildStderr` is [dropped], the `ChildStderr`'s
/// underlying file handle will be closed.
///
/// [`stderr`]: super::Child::stderr
/// [`Child`]: super::Child
/// [dropped]: Drop
#[derive(Debug)]
pub struct ChildStderr(AsyncFd);
#[derive(Debug)]
enum InnerStdio {
Inherit,
Null,
Piped,
File(Fd),
}
impl Default for Stdio {
fn default() -> Self {
Self(InnerStdio::Inherit)
}
}
impl Stdio {
/// A new pipe should be arranged to connect the parent and child processes.
pub fn piped() -> Self {
Self(InnerStdio::Piped)
}
/// The child inherits from the corresponding parent descriptor. This is the default mode.
pub fn inherit() -> Self {
Self(InnerStdio::Inherit)
}
/// This stream will be ignored. This is the equivalent of attaching the
/// stream to `/dev/null`.
pub fn null() -> Self {
Self(InnerStdio::Null)
}
/// Returns a pair of file descriptors, one for the parent and one for the
/// child. If the child's file descriptor is `None`, then it shall be
/// inherited from the parent. If the parent's file descriptor is `None`,
/// then there is no link to the child and the child owns the other half of
/// the file descriptor (if any). Both file descriptors will be `None` if
/// stdio is being inherited.
pub(super) fn pipes(&self, readable: bool) -> Result<(Option<Fd>, Option<Fd>), Errno> {
match &self.0 {
InnerStdio::Inherit => Ok((None, None)),
InnerStdio::Null => Ok((None, Some(Fd::null(readable)?))),
InnerStdio::Piped => {
let (reader, writer) = pipe()?;
let (parent, child) = if readable {
(writer, reader)
} else {
(reader, writer)
};
Ok((Some(parent), Some(child)))
}
InnerStdio::File(file) => Ok((None, Some(file.dup()?))),
}
}
}
impl<T: IntoRawFd> From<T> for Stdio {
fn from(f: T) -> Self {
Self(InnerStdio::File(Fd::new(f.into_raw_fd())))
}
}
impl From<Stdio> for std::process::Stdio {
fn from(stdio: Stdio) -> Self {
match stdio.0 {
InnerStdio::Inherit => Self::inherit(),
InnerStdio::Null => Self::null(),
InnerStdio::Piped => Self::piped(),
InnerStdio::File(fd) => Self::from(std::fs::File::from(fd)),
}
}
}
impl ChildStdin {
pub(super) fn new(fd: Fd) -> Result<Self, Errno> {
AsyncFd::writable(fd).map(Self)
}
}
impl ChildStdout {
pub(super) fn new(fd: Fd) -> Result<Self, Errno> {
AsyncFd::readable(fd).map(Self)
}
}
impl ChildStderr {
pub(super) fn new(fd: Fd) -> Result<Self, Errno> {
AsyncFd::readable(fd).map(Self)
}
}
impl AsyncWrite for ChildStdin {
fn poll_write(
mut self: Pin<&mut Self>,
cx: &mut Context<'_>,
buf: &[u8],
) -> Poll<tokio::io::Result<usize>> {
Pin::new(&mut self.0).poll_write(cx, buf)
}
fn poll_flush(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<io::Result<()>> {
Pin::new(&mut self.0).poll_flush(cx)
}
fn poll_shutdown(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<io::Result<()>> {
Pin::new(&mut self.0).poll_shutdown(cx)
}
}
impl AsyncRead for ChildStdout {
fn poll_read(
mut self: Pin<&mut Self>,
cx: &mut Context,
buf: &mut ReadBuf,
) -> Poll<tokio::io::Result<()>> {
Pin::new(&mut self.0).poll_read(cx, buf)
}
}
impl AsyncRead for ChildStderr {
fn poll_read(
mut self: Pin<&mut Self>,
cx: &mut Context,
buf: &mut ReadBuf,
) -> Poll<tokio::io::Result<()>> {
Pin::new(&mut self.0).poll_read(cx, buf)
}
}
impl FromRawFd for ChildStdin {
unsafe fn from_raw_fd(fd: i32) -> Self {
Self::new(Fd::new(fd)).unwrap()
}
}
impl FromRawFd for ChildStdout {
unsafe fn from_raw_fd(fd: i32) -> Self {
Self::new(Fd::new(fd)).unwrap()
}
}
impl FromRawFd for ChildStderr {
unsafe fn from_raw_fd(fd: i32) -> Self {
Self::new(Fd::new(fd)).unwrap()
}
}
impl From<tokio::process::ChildStdin> for ChildStdin {
fn from(io: tokio::process::ChildStdin) -> Self {
let fd = io.as_raw_fd();
let fd = unsafe { libc::dup(fd) };
drop(io);
unsafe { Self::from_raw_fd(fd) }
}
}
impl From<tokio::process::ChildStdout> for ChildStdout {
fn from(io: tokio::process::ChildStdout) -> Self {
let fd = io.as_raw_fd();
let fd = unsafe { libc::dup(fd) };
drop(io);
unsafe { Self::from_raw_fd(fd) }
}
}
impl From<tokio::process::ChildStderr> for ChildStderr {
fn from(io: tokio::process::ChildStderr) -> Self {
let fd = io.as_raw_fd();
let fd = unsafe { libc::dup(fd) };
drop(io);
unsafe { Self::from_raw_fd(fd) }
}
}

@ -0,0 +1,81 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use std::ffi::{CStr, CString, OsStr};
use std::os::unix::ffi::OsStrExt;
use syscalls::Errno;
pub fn to_cstring<S: AsRef<OsStr>>(s: S) -> CString {
CString::new(s.as_ref().as_bytes()).unwrap()
}
pub struct CStringArray {
items: Vec<CString>,
ptrs: Vec<*const libc::c_char>,
}
impl CStringArray {
pub fn with_capacity(capacity: usize) -> Self {
let mut result = CStringArray {
items: Vec::with_capacity(capacity),
ptrs: Vec::with_capacity(capacity + 1),
};
result.ptrs.push(core::ptr::null());
result
}
pub fn push(&mut self, item: CString) {
let l = self.ptrs.len();
self.ptrs[l - 1] = item.as_ptr();
self.ptrs.push(core::ptr::null());
self.items.push(item);
}
pub fn as_ptr(&self) -> *const *const libc::c_char {
self.ptrs.as_ptr()
}
pub fn set(&mut self, i: usize, item: CString) {
self.ptrs[i] = item.as_ptr();
self.items[i] = item;
}
pub fn get(&self, i: usize) -> &CStr {
self.items[i].as_ref()
}
pub fn iter(&self) -> impl Iterator<Item = &CStr> {
self.items.iter().map(|x| x.as_ref())
}
}
pub unsafe fn reset_signal_handling() -> Result<(), Errno> {
use core::mem::MaybeUninit;
// Reset signal handling so the child process starts in a standardized
// state. libstd ignores SIGPIPE, and signal-handling libraries often set a
// mask. Child processes inherit ignored signals and the signal mask from
// their parent, but most UNIX programs do not reset these things on their
// own, so we need to clean things up now to avoid confusing the program
// we're about to run.
let mut set = MaybeUninit::<libc::sigset_t>::uninit();
Errno::result(libc::sigemptyset(set.as_mut_ptr()))?;
Errno::result(libc::pthread_sigmask(
libc::SIG_SETMASK,
set.as_ptr(),
core::ptr::null_mut(),
))?;
let ret = libc::signal(libc::SIGPIPE, libc::SIG_DFL);
if ret == libc::SIG_ERR {
return Err(Errno::last());
}
Ok(())
}
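
`CStringArray` maintains the invariant that `ptrs` always has one more entry than `items` and ends in a NULL pointer, which is the layout `execvpe` expects for `argv` and `envp`. The following is a reduced sketch of that invariant; `ArgArray` and `collect` are hypothetical names, and `collect` only exists to walk the array the way C code would.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical simplified version of a NULL-terminated `argv`-style array.
struct ArgArray {
    // Owns the strings so the pointers below stay valid.
    items: Vec<CString>,
    // Always `items.len() + 1` entries; the last one is NULL.
    ptrs: Vec<*const c_char>,
}

impl ArgArray {
    fn new() -> Self {
        ArgArray {
            items: Vec::new(),
            ptrs: vec![std::ptr::null()],
        }
    }

    fn push(&mut self, item: CString) {
        // Overwrite the current terminator, then append a new one.
        let last = self.ptrs.len() - 1;
        self.ptrs[last] = item.as_ptr();
        self.ptrs.push(std::ptr::null());
        // Moving the `CString` does not move its heap buffer, so the pointer
        // stored above remains valid.
        self.items.push(item);
    }

    fn len(&self) -> usize {
        self.items.len()
    }

    fn as_ptr(&self) -> *const *const c_char {
        self.ptrs.as_ptr()
    }
}

/// Walks the array until the NULL terminator and copies out each string.
fn collect(array: &ArgArray) -> Vec<String> {
    let mut out = Vec::new();
    let mut cursor = array.as_ptr();
    // SAFETY: `ptrs` is NULL-terminated and every non-NULL entry points to a
    // NUL-terminated string owned by `items`.
    unsafe {
        while !(*cursor).is_null() {
            out.push(CStr::from_ptr(*cursor).to_string_lossy().into_owned());
            cursor = cursor.add(1);
        }
    }
    out
}
```

Because the terminator is written at construction time, even an empty array is valid to hand to C code.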

38
reverie-ptrace/Cargo.toml Normal file
@ -0,0 +1,38 @@
# @generated by autocargo
[package]
name = "reverie-ptrace"
version = "0.1.0"
authors = ["Facebook"]
edition = "2021"
license = "BSD-2-Clause"
[dependencies]
anyhow = "1.0.51"
async-trait = "0.1.51"
bincode = "1.3.3"
bitflags = "1.3"
bytes = { version = "1.1", features = ["serde"] }
futures = { version = "0.3.13", features = ["async-await", "compat"] }
goblin = "0.3"
lazy_static = "1.0"
libc = "0.2.98"
nix = "0.22"
num-traits = "0.2"
parking_lot = { version = "0.11.2", features = ["send_guard"] }
paste = "1.0"
perf-event-open-sys = "1.0"
procfs = "0.9"
raw-cpuid = "9.0"
reverie = { version = "0.1.0", path = "../reverie" }
serde = { version = "1.0.126", features = ["derive", "rc"] }
thiserror = "1.0.29"
tokio = { version = "1.10", features = ["full", "test-util", "tracing"] }
tokio-stream = { version = "0.1.4", features = ["fs", "io-util", "net", "signal", "sync", "time"] }
tracing = "0.1.29"
tracing-subscriber = { version = "0.3.3", features = ["ansi", "env-filter", "fmt", "json", "parking_lot", "registry"] }
unwind = { version = "0.4", features = ["ptrace"] }
[dev-dependencies]
quickcheck = "1.0"
quickcheck_macros = "1.0"

@ -0,0 +1,100 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use std::future::Future;
use std::marker::Unpin;
use std::mem;
use std::pin::Pin;
use std::slice;
use std::task::{Context, Poll};
use futures::future::FutureExt;
/// Represents a set of children.
#[derive(Clone, Default)]
pub struct Children<T> {
inner: Vec<T>,
}
impl<'a, T> IntoIterator for &'a Children<T> {
type Item = &'a T;
type IntoIter = slice::Iter<'a, T>;
fn into_iter(self) -> slice::Iter<'a, T> {
self.inner.iter()
}
}
impl<'a, T> IntoIterator for &'a mut Children<T> {
type Item = &'a mut T;
type IntoIter = slice::IterMut<'a, T>;
fn into_iter(self) -> slice::IterMut<'a, T> {
self.inner.iter_mut()
}
}
#[allow(unused)]
impl<T> Children<T> {
pub fn new() -> Self {
Children { inner: Vec::new() }
}
pub fn push(&mut self, item: T) {
self.inner.push(item);
}
pub fn is_empty(&self) -> bool {
self.inner.is_empty()
}
pub fn len(&self) -> usize {
self.inner.len()
}
pub fn into_inner(self) -> Vec<T> {
self.inner
}
pub fn take_inner(&mut self) -> Vec<T> {
mem::take(&mut self.inner)
}
pub fn retain<F>(&mut self, f: F)
where
F: FnMut(&T) -> bool,
{
self.inner.retain(f)
}
}
impl<T> Future for Children<T>
where
T: Future + Unpin,
{
// (Orphans, Finished)
type Output = (Self, Vec<T::Output>);
fn poll(mut self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output> {
let mut inner = mem::take(&mut self.inner);
let mut ready = Vec::new();
// Iterate backwards through the vec. If an item is ready, swap_remove
// it. It is important to iterate backwards so that swap_remove doesn't
// perturb the ordering on the part of the vec we haven't yet iterated
// over.
// Note: `self.inner` was emptied by `mem::take` above, so iterate over the
// local `inner` instead.
for i in (0..inner.len()).rev() {
if let Poll::Ready(x) = inner[i].poll_unpin(cx) {
inner.swap_remove(i);
ready.push(x);
}
}
Poll::Ready((Children { inner }, ready))
}
}
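
The backward iteration in `poll` matters because `swap_remove` moves the last element into the vacated slot. A small standalone sketch of the same pattern on a plain `Vec` follows; `take_ready` is a hypothetical helper, with a predicate standing in for polling a future.

```rust
/// Removes every element for which `is_ready` returns true and returns the
/// removed elements. Each element is tested exactly once.
fn take_ready<T, F>(items: &mut Vec<T>, mut is_ready: F) -> Vec<T>
where
    F: FnMut(&T) -> bool,
{
    let mut ready = Vec::new();
    // Walk from the back: `swap_remove(i)` fills slot `i` with the current
    // last element, which has already been visited, so no element is skipped
    // or tested twice.
    for i in (0..items.len()).rev() {
        if is_ready(&items[i]) {
            ready.push(items.swap_remove(i));
        }
    }
    ready
}
```

Iterating forward instead would move an unvisited element into slot `i` and then step past it. The price of `swap_remove` is that the survivors do not keep their original order.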

@ -0,0 +1,22 @@
/*
* Copyright (c) 2018-2019, Trustees of Indiana University
* ("University Works" via Baojun Wang)
* Copyright (c) 2018-2019, Ryan Newton
* ("Traditional Works of Scholarship")
* Copyright (c) 2020-, Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
/// A page that is reserved by Reverie in every guest process.
pub const PRIVATE_PAGE_OFFSET: u64 = 0x7000_0000;
/// Base address of the trampoline, located at the start of the private page.
pub const TRAMPOLINE_BASE: u64 = PRIVATE_PAGE_OFFSET;
/// Size of the trampoline in bytes.
pub const TRAMPOLINE_SIZE: usize = 0x1000;
/// Total size of the private page in bytes.
pub const PRIVATE_PAGE_SIZE: usize = TRAMPOLINE_SIZE;

@ -0,0 +1,40 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use super::consts::*;
use nix::{
sys::uio::{self, IoVec, RemoteIoVec},
unistd::Pid,
};
/// Generates the syscall instructions in the injected page.
///
/// The page address should be `0x7000_0000` (`PRIVATE_PAGE_OFFSET`). The byte
/// code can be confirmed by running objcopy:
///
/// `x86_64-linux-gnu-objcopy -I binary /tmp/1.bin -O elf64-x86-64 -B i386:x86-64 /tmp/1.elf`
///
/// The output of `objdump -d /tmp/1.elf` must then match the instructions
/// listed below.
pub fn populate_mmap_page(pid: Pid, page_address: u64) -> nix::Result<()> {
/* the syscall sequences used here:
* 0: 0f 05 syscall // untraced syscall
* 2: 0f 0b ud2
* 4: 0f 05 syscall // traced syscall
* 6: 0f 0b ud2
*/
let mut syscall_stubs: Vec<u8> = vec![0x0f, 0x05, 0x0f, 0x0b, 0x0f, 0x05, 0x0f, 0x0b];
syscall_stubs.resize_with(TRAMPOLINE_SIZE, || 0xcc);
let local_iov = &[IoVec::from_slice(syscall_stubs.as_slice())];
let remote_iov = &[RemoteIoVec {
base: page_address as usize,
len: TRAMPOLINE_SIZE,
}];
// Write the whole page to the guest. Everything after the stubs was filled
// with `int3` above to prevent unintended execution in our injected page.
uio::process_vm_writev(pid, local_iov, remote_iov)?;
Ok(())
}
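
The contents of the injected page can be checked without touching a guest process. The sketch below rebuilds the same byte layout locally. `build_trampoline_page` and `SYSCALL_STUBS` are hypothetical names; the byte values and page size are taken from the code above.

```rust
const TRAMPOLINE_SIZE: usize = 0x1000;

/// `syscall; ud2` twice: the untraced stub at offset 0, the traced stub at
/// offset 4.
const SYSCALL_STUBS: [u8; 8] = [0x0f, 0x05, 0x0f, 0x0b, 0x0f, 0x05, 0x0f, 0x0b];

/// Builds the page image that would be written into the guest.
fn build_trampoline_page() -> Vec<u8> {
    let mut page = SYSCALL_STUBS.to_vec();
    // Pad the rest of the page with `int3` (0xcc) so that a stray jump into
    // the page traps instead of executing arbitrary bytes.
    page.resize(TRAMPOLINE_SIZE, 0xcc);
    page
}
```

Each stub is followed by `ud2`, so control that falls through a `syscall` instruction faults immediately instead of running into the next stub.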

@ -0,0 +1,14 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
mod consts;
mod mmap;
pub use consts::*;
pub use mmap::populate_mmap_page;

252
reverie-ptrace/src/debug.rs Normal file
@ -0,0 +1,252 @@
/*
* Copyright (c) 2018-2019, Trustees of Indiana University
* ("University Works" via Baojun Wang)
* Copyright (c) 2018-2019, Ryan Newton
* ("Traditional Works of Scholarship")
* Copyright (c) 2020-, Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
//! Convenience functions for debugging tracees.
use core::fmt;
use nix::sys::{ptrace, signal};
use reverie::syscalls::{Addr, MemoryAccess};
use reverie::Pid;
use tracing::debug;
use crate::trace::Stopped;
// TODO: could check whether or not stack is valid
fn show_stackframe(tid: Pid, stack: u64, top_size: usize, bot_size: usize) -> String {
let mut text = String::new();
if stack < top_size as u64 {
return text;
}
let sp_top = stack - top_size as u64;
let sp_bot = stack + bot_size as u64;
let mut sp = sp_top;
while sp <= sp_bot {
match ptrace::read(tid.into(), sp as ptrace::AddressType) {
Err(_) => break,
Ok(x) => {
if sp == stack {
text += &format!(" => {:12x}: {:16x}\n", sp, x);
} else {
text += &format!(" {:12x}: {:16x}\n", sp, x);
}
}
}
sp += 8;
}
text
}
fn show_user_regs(regs: &libc::user_regs_struct) -> String {
let mut res = String::new();
res += &format!(
"rax {:16x} rbx {:16x} rcx {:16x} rdx {:16x}\n",
regs.rax, regs.rbx, regs.rcx, regs.rdx
);
res += &format!(
"rsi {:16x} rdi {:16x} rbp {:16x} rsp {:16x}\n",
regs.rsi, regs.rdi, regs.rbp, regs.rsp
);
res += &format!(
" r8 {:16x} r9 {:16x} r10 {:16x} r11 {:16x}\n",
regs.r8, regs.r9, regs.r10, regs.r11
);
res += &format!(
"r12 {:16x} r13 {:16x} r14 {:16x} r15 {:16x}\n",
regs.r12, regs.r13, regs.r14, regs.r15
);
res += &format!("rip {:16x} eflags {:16x}\n", regs.rip, regs.eflags);
res += &format!(
"cs {:x} ss {:x} ds {:x} es {:x}\nfs {:x} gs {:x}",
regs.cs, regs.ss, regs.ds, regs.es, regs.fs, regs.gs
);
res
}
fn show_proc_maps(maps: &procfs::process::MemoryMap) -> String {
use procfs::process::MMapPath;
let mut res = String::new();
let fp = match &maps.pathname {
MMapPath::Path(path) => String::from(path.to_str().unwrap_or("")),
MMapPath::Vdso => String::from("[vdso]"),
MMapPath::Vvar => String::from("[vvar]"),
MMapPath::Vsyscall => String::from("[vsyscall]"),
MMapPath::Stack => String::from("[stack]"),
MMapPath::Other(s) => s.clone(),
_ => String::from(""),
};
let s = format!(
"{:x}-{:x} {} {:08x} {:02x}:{:02x} {}",
maps.address.0, maps.address.1, maps.perms, maps.offset, maps.dev.0, maps.dev.1, maps.inode
);
res.push_str(&s);
// Pad to a fixed column; `saturating_sub` avoids underflow on long entries.
(0..=72usize.saturating_sub(s.len())).for_each(|_| res.push(' '));
res.push_str(&fp);
res
}
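// Illustrative sketch, not part of the commit: the column padding done by hand
// in `show_proc_maps` could also be expressed with a format width. The helper
// name `pad_columns` is hypothetical. Unlike the loop above, this variant emits
// no separator at all when the left part already exceeds the column.

```rust
/// Left-aligns `left` in a column of `width` characters and appends `right`.
/// `{:<width$}` pads with spaces and emits `left` unchanged when it is
/// already wider than the column, so no subtraction can underflow.
fn pad_columns(left: &str, right: &str, width: usize) -> String {
    format!("{:<width$}{}", left, right, width = width)
}
```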
fn task_rip_is_valid(pid: Pid, rip: u64) -> bool {
// The 16 bytes starting at `rip` must lie inside an executable mapping.
// `saturating_add` keeps the bound check from overflowing near u64::MAX.
let end = rip.saturating_add(0x10);
procfs::process::Process::new(pid.as_raw())
.and_then(|p| p.maps())
.map(|maps| {
maps.iter()
.any(|e| e.perms.contains('x') && e.address.0 <= rip && e.address.1 > end)
})
.unwrap_or(false)
}
// XXX: should limit the number of calls to procfs.
/// Show the fault context of a stopped task.
pub fn show_fault_context(task: &Stopped, sig: signal::Signal) {
let regs = task.getregs().unwrap();
let siginfo = task.getsiginfo().unwrap();
debug!(
"{:?} got {:?} si_errno: {}, si_code: {}, regs\n{}",
task,
sig,
siginfo.si_errno,
siginfo.si_code,
show_user_regs(&regs)
);
debug!(
"stackframe from rsp@{:x}\n{}",
regs.rsp,
show_stackframe(task.pid(), regs.rsp, 0x40, 0x80)
);
if task_rip_is_valid(task.pid(), regs.rip) {
if let Some(addr) = Addr::from_raw(regs.rip as usize) {
let mut buf: [u8; 16] = [0; 16];
if task.read_exact(addr, &mut buf).is_ok() {
debug!("insn @{:x?} = {:02x?}", addr, buf);
}
}
} else {
debug!("insn @{:x?} = <invalid rip>", regs.rip);
}
procfs::process::Process::new(task.pid().as_raw())
.and_then(|p| p.maps())
.unwrap_or_else(|_| Vec::new())
.iter()
.for_each(|e| {
debug!("{}", show_proc_maps(e));
});
}
/// As a debugging aid, dump the current state of the guest in a readable format.
/// If an optional snapshot of an earlier register state is provided, the results
/// will be printed as a DIFF from that previous state.
pub fn log_guest_state(context_msg: &str, tid: Pid, old_regs: &Option<libc::user_regs_struct>) {
// TODO: could certainly derive this "diffing" functionality as a macro if
// there is a library for that.
let hdr = format!("{}: guest state (tid {}) has ...", context_msg, tid);
let cur = ptrace::getregs(tid.into()).unwrap();
match old_regs {
None => debug!("{} regs = {:?}", hdr, cur),
Some(old) => {
let mut msg = String::from(" DIFF in regs from prev (new/old): ");
let len1 = msg.len();
if cur.r15 != old.r15 {
msg.push_str(&format!("r15: {}/{} ", cur.r15, old.r15));
}
if cur.r14 != old.r14 {
msg.push_str(&format!("r14: {}/{} ", cur.r14, old.r14));
}
if cur.r13 != old.r13 {
msg.push_str(&format!("r13: {}/{} ", cur.r13, old.r13));
}
if cur.r12 != old.r12 {
msg.push_str(&format!("r12: {}/{} ", cur.r12, old.r12));
}
if cur.rbp != old.rbp {
msg.push_str(&format!("rbp: {}/{} ", cur.rbp, old.rbp));
}
if cur.rbx != old.rbx {
msg.push_str(&format!("rbx: {}/{} ", cur.rbx, old.rbx));
}
if cur.r11 != old.r11 {
msg.push_str(&format!("r11: {}/{} ", cur.r11, old.r11));
}
if cur.r10 != old.r10 {
msg.push_str(&format!("r10: {}/{} ", cur.r10, old.r10));
}
if cur.r9 != old.r9 {
msg.push_str(&format!("r9: {}/{} ", cur.r9, old.r9));
}
if cur.r8 != old.r8 {
msg.push_str(&format!("r8: {}/{} ", cur.r8, old.r8));
}
if cur.rax != old.rax {
msg.push_str(&format!("rax: {}/{} ", cur.rax, old.rax));
}
if cur.rcx != old.rcx {
msg.push_str(&format!("rcx: {}/{} ", cur.rcx, old.rcx));
}
if cur.rdx != old.rdx {
msg.push_str(&format!("rdx: {}/{} ", cur.rdx, old.rdx));
}
if cur.rsi != old.rsi {
msg.push_str(&format!("rsi: {}/{} ", cur.rsi, old.rsi));
}
if cur.rdi != old.rdi {
msg.push_str(&format!("rdi: {}/{} ", cur.rdi, old.rdi));
}
if cur.orig_rax != old.orig_rax {
msg.push_str(&format!("orig_rax: {}/{} ", cur.orig_rax, old.orig_rax));
}
if cur.rip != old.rip {
msg.push_str(&format!("rip: {}/{} ", cur.rip, old.rip));
}
if cur.cs != old.cs {
msg.push_str(&format!("cs: {}/{} ", cur.cs, old.cs));
}
if cur.eflags != old.eflags {
msg.push_str(&format!("eflags: {}/{} ", cur.eflags, old.eflags));
}
if cur.rsp != old.rsp {
msg.push_str(&format!("rsp: {}/{} ", cur.rsp, old.rsp));
}
if cur.ss != old.ss {
msg.push_str(&format!("ss: {}/{} ", cur.ss, old.ss));
}
if cur.fs_base != old.fs_base {
msg.push_str(&format!("fs_base: {}/{} ", cur.fs_base, old.fs_base));
}
if cur.gs_base != old.gs_base {
msg.push_str(&format!("gs_base: {}/{} ", cur.gs_base, old.gs_base));
}
if cur.ds != old.ds {
msg.push_str(&format!("ds: {}/{} ", cur.ds, old.ds));
}
if cur.es != old.es {
msg.push_str(&format!("es: {}/{} ", cur.es, old.es));
}
if cur.fs != old.fs {
msg.push_str(&format!("fs: {}/{} ", cur.fs, old.fs));
}
if cur.gs != old.gs {
msg.push_str(&format!("gs: {}/{} ", cur.gs, old.gs));
}
if msg.len() == len1 {
debug!("{} NO differences from prev register state.", hdr)
} else {
debug!("{} {}", hdr, msg);
}
}
}
}
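// Illustrative sketch, not part of the commit: the TODO in `log_guest_state`
// suggests deriving the field-by-field diff with a macro. One possible shape is
// shown below on a small stand-in struct. `Regs`, `diff_fields!` and `diff_regs`
// are hypothetical names, and values are printed in hex here, whereas the
// function above prints decimal.

```rust
#[derive(Clone, Copy, Debug)]
struct Regs {
    rax: u64,
    rip: u64,
    rsp: u64,
}

/// Builds a string with "name: new/old " for every listed field that differs.
macro_rules! diff_fields {
    ($cur:expr, $old:expr, $($field:ident),+ $(,)?) => {{
        let mut msg = String::new();
        $(
            if $cur.$field != $old.$field {
                msg.push_str(&format!(
                    "{}: {:#x}/{:#x} ",
                    stringify!($field),
                    $cur.$field,
                    $old.$field
                ));
            }
        )+
        msg
    }};
}

/// Returns an empty string when the two snapshots agree on all listed fields.
fn diff_regs(cur: &Regs, old: &Regs) -> String {
    diff_fields!(cur, old, rax, rip, rsp)
}
```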


@ -0,0 +1,27 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use thiserror::Error;
use crate::trace;
/// A reverie-ptrace error. This error type isn't meant to be exposed to the
/// user.
#[derive(Error, Debug)]
pub enum Error {
/// An internal error that is only ever meant to be used as a reverie-ptrace
/// implementation detail. None of these errors should make it through to the
/// user.
#[error(transparent)]
Internal(#[from] trace::Error),
/// A public error.
#[error(transparent)]
External(#[from] reverie::Error),
}


@ -0,0 +1,45 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
/// Breakpoint type
#[derive(PartialEq, Debug)]
pub enum BreakpointType {
/// Software breakpoint
Software,
/// Hardware breakpoint
Hardware,
/// Read watchpoint
ReadWatch,
/// Write watchpoint
WriteWatch,
}
impl BreakpointType {
pub fn new(ty: i32) -> Option<Self> {
match ty {
0 => Some(BreakpointType::Software),
1 => Some(BreakpointType::Hardware),
// GDB numbers watchpoints as 2 = write, 3 = read (4 = access is not handled here).
2 => Some(BreakpointType::WriteWatch),
3 => Some(BreakpointType::ReadWatch),
_ => None,
}
}
}
/// Breakpoint.
#[derive(PartialEq, Debug)]
pub struct Breakpoint {
/// Breakpoint type.
pub ty: BreakpointType,
/// Address to set breakpoint.
pub addr: u64,
/// Additional expression used to implement conditional breakpoints.
/// See <https://sourceware.org/gdb/current/onlinedocs/gdb/Bytecode-Descriptions.html>.
pub bytecode: Option<Vec<u8>>,
}


@ -0,0 +1,24 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::BytesMut;
#[derive(PartialEq, Debug)]
pub struct QStartNoAckMode;
impl ParseCommand for QStartNoAckMode {
fn parse(bytes: BytesMut) -> Option<Self> {
if bytes.is_empty() {
Some(QStartNoAckMode)
} else {
None
}
}
}


@ -0,0 +1,32 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::BytesMut;
#[derive(PartialEq, Debug)]
pub struct QThreadEvents {
pub enable: bool,
}
impl ParseCommand for QThreadEvents {
fn parse(bytes: BytesMut) -> Option<Self> {
if !bytes.starts_with(b":") {
None
} else {
let value: u32 = decode_hex(&bytes[1..]).ok()?;
if value != 0 && value != 1 {
None
} else {
let enable = value == 1;
Some(QThreadEvents { enable })
}
}
}
}
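// Illustrative sketch, not part of the commit: the same `:0` / `:1` check using
// only the standard library. `parse_thread_events` is a hypothetical helper,
// and `u32::from_str_radix` stands in for the crate's `decode_hex`, whose exact
// behavior is not shown here.

```rust
/// Parses the argument of a `QThreadEvents` packet (command name already
/// stripped). Only ":0" and ":1" style values are accepted.
fn parse_thread_events(arg: &[u8]) -> Option<bool> {
    let digits = arg.strip_prefix(b":")?;
    let text = std::str::from_utf8(digits).ok()?;
    match u32::from_str_radix(text, 16).ok()? {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}
```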


@ -0,0 +1,24 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::BytesMut;
#[derive(PartialEq, Debug)]
pub struct QuestionMark;
impl ParseCommand for QuestionMark {
fn parse(bytes: BytesMut) -> Option<Self> {
if bytes.is_empty() {
Some(QuestionMark)
} else {
None
}
}
}


@ -0,0 +1,16 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use bytes::BytesMut;
use crate::gdbstub::commands::ParseCommand;
pub struct c {
pub addr: Option<u64>,
}


@ -0,0 +1,33 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use bytes::BytesMut;
use reverie::Pid;
use crate::gdbstub::{commands::*, hex::*};
#[derive(PartialEq, Debug)]
pub struct D {
pub pid: Option<Pid>,
}
impl ParseCommand for D {
fn parse(bytes: BytesMut) -> Option<Self> {
if bytes.is_empty() {
Some(D { pid: None })
} else if !bytes.starts_with(b";") {
None
} else {
let pid = decode_hex(&bytes[1..]).ok()?;
Some(D {
pid: Some(Pid::from_raw(pid)),
})
}
}
}


@ -0,0 +1,20 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::BytesMut;
#[derive(PartialEq, Debug)]
pub struct g;
impl ParseCommand for g {
fn parse(bytes: BytesMut) -> Option<Self> {
if bytes.is_empty() { Some(g) } else { None }
}
}


@ -0,0 +1,27 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::BytesMut;
#[derive(PartialEq, Debug)]
pub struct G {
pub vals: Vec<u8>,
}
impl ParseCommand for G {
fn parse(bytes: BytesMut) -> Option<Self> {
if bytes.is_empty() {
None
} else {
let vals = decode_hex_string(&bytes).ok()?;
Some(G { vals })
}
}
}


@ -0,0 +1,49 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::BytesMut;
#[derive(PartialEq, Debug)]
pub struct H {
pub op: ThreadOp,
pub id: ThreadId,
}
impl ParseCommand for H {
fn parse(mut bytes: BytesMut) -> Option<Self> {
if bytes.is_empty() {
None
} else {
let (ch, bytes) = bytes.split_first_mut()?;
let op = match *ch {
b'c' => Some(ThreadOp::c),
b'g' => Some(ThreadOp::g),
b'G' => Some(ThreadOp::G),
b'm' => Some(ThreadOp::m),
b'M' => Some(ThreadOp::M),
_ => None,
}?;
if bytes == &b"-1"[..] {
Some(H {
op,
id: ThreadId::all(),
})
} else if bytes == &b"0"[..] {
Some(H {
op,
id: ThreadId::any(),
})
} else {
let thread_id = ThreadId::decode(bytes)?;
Some(H { op, id: thread_id })
}
}
}
}


@ -0,0 +1,12 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
pub struct k;


@ -0,0 +1,31 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use bytes::BytesMut;
use crate::gdbstub::{commands::*, hex::*};
#[derive(PartialEq, Debug)]
pub struct m {
pub addr: u64,
pub length: usize,
}
impl ParseCommand for m {
fn parse(mut bytes: BytesMut) -> Option<Self> {
if bytes.is_empty() {
None
} else {
let mut iter = bytes.split_mut(|c| *c == b',');
let addr = iter.next().and_then(|x| decode_hex(x).ok())?;
let length = iter.next().and_then(|x| decode_hex(x).ok())?;
Some(m { addr, length })
}
}
}
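// Illustrative sketch, not part of the commit: how an `m addr,length` payload
// splits into two hex fields, written against the standard library only.
// `parse_hex` and `parse_read_memory` are hypothetical stand-ins for the
// crate's `decode_hex` and `m::parse`.

```rust
use std::convert::TryFrom;

/// Decodes a hex field such as `7ffff7dd1000` into an integer.
fn parse_hex(field: &[u8]) -> Option<u64> {
    let text = std::str::from_utf8(field).ok()?;
    u64::from_str_radix(text, 16).ok()
}

/// Parses "addr,length" (both hexadecimal) into `(addr, length)`.
fn parse_read_memory(payload: &[u8]) -> Option<(u64, usize)> {
    let mut fields = payload.split(|c| *c == b',');
    let addr = parse_hex(fields.next()?)?;
    let length = usize::try_from(parse_hex(fields.next()?)?).ok()?;
    Some((addr, length))
}
```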


@ -0,0 +1,36 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use bytes::BytesMut;
use crate::gdbstub::{commands::*, hex::*};
#[derive(PartialEq, Debug)]
pub struct M {
pub addr: u64,
pub length: usize,
pub vals: Vec<u8>,
}
impl ParseCommand for M {
fn parse(mut bytes: BytesMut) -> Option<Self> {
let mut iter = bytes.split_mut(|c| *c == b',' || *c == b':');
let addr = iter.next()?;
let len = iter.next()?;
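let j = 2 + addr.len() + len.len();
let addr = decode_hex(addr).ok()?;
let len = decode_hex(len).ok()?;
// Reject packets that lack the `:` separator: `split_off` panics when the
// offset points past the end of the buffer.
if j > bytes.len() {
return None;
}
let vals = bytes.split_off(j);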
Some(M {
addr,
length: len,
vals: decode_hex_string(&vals).ok()?,
})
}
}
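// Illustrative sketch, not part of the commit: a bounds-safe way to split an
// `M addr,length:XX...` payload by locating the separators first. All names
// below are hypothetical, and the hex decoding is a simplified stand-in for
// `decode_hex` / `decode_hex_string`.

```rust
use std::convert::TryFrom;

/// Converts one ASCII hex digit to its value.
fn nibble(c: u8) -> Option<u8> {
    (c as char).to_digit(16).map(|d| d as u8)
}

/// Decodes 1 to 16 hex digits into an integer.
fn hex_u64(field: &[u8]) -> Option<u64> {
    if field.is_empty() || field.len() > 16 {
        return None;
    }
    field
        .iter()
        .try_fold(0u64, |acc, c| Some((acc << 4) | u64::from(nibble(*c)?)))
}

/// Parses "addr,length:hexbytes" into `(addr, length, bytes)`.
fn parse_write_memory(payload: &[u8]) -> Option<(u64, usize, Vec<u8>)> {
    // Without a ':' there is no data section, so bail out early instead of
    // computing an offset past the end of the buffer.
    let colon = payload.iter().position(|c| *c == b':')?;
    let (header, data) = (&payload[..colon], &payload[colon + 1..]);
    let comma = header.iter().position(|c| *c == b',')?;
    let addr = hex_u64(&header[..comma])?;
    let length = usize::try_from(hex_u64(&header[comma + 1..])?).ok()?;
    // Each data byte is encoded as exactly two hex digits.
    if data.len() % 2 != 0 {
        return None;
    }
    let vals = data
        .chunks(2)
        .map(|pair| Some((nibble(pair[0])? << 4) | nibble(pair[1])?))
        .collect::<Option<Vec<u8>>>()?;
    Some((addr, length, vals))
}
```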


@ -0,0 +1,14 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
pub struct p {
pub reg_id: usize,
}


@ -0,0 +1,16 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::Bytes;
pub struct P {
pub reg_id: usize,
pub val: Bytes, // could val size >= sizeof(usize)? SSE3?
}


@ -0,0 +1,30 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::BytesMut;
#[derive(PartialEq, Debug)]
pub struct qAttached {
pub pid: Option<i32>,
}
impl ParseCommand for qAttached {
fn parse(mut bytes: BytesMut) -> Option<Self> {
if !bytes.starts_with(b":") {
None
} else {
let mut iter = bytes.split_mut(|c| *c == b':');
let _ = iter.next()?;
Some(qAttached {
pid: iter.next().and_then(|x| decode_hex(x).ok()),
})
}
}
}


@ -0,0 +1,20 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::commands::*;
use bytes::BytesMut;
#[derive(PartialEq, Debug)]
pub struct qC {}
impl ParseCommand for qC {
fn parse(bytes: BytesMut) -> Option<Self> {
if !bytes.is_empty() { None } else { Some(qC {}) }
}
}


@ -0,0 +1,28 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::{Bytes, BytesMut};
#[derive(PartialEq, Debug)]
pub struct qSupported {
pub features: Bytes, // use Features type here!
}
impl ParseCommand for qSupported {
fn parse(bytes: BytesMut) -> Option<Self> {
if bytes.is_empty() {
None
} else {
Some(qSupported {
features: bytes.freeze(),
})
}
}
}


@ -0,0 +1,51 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use bytes::BytesMut;
use crate::gdbstub::{commands::*, hex::*};
#[derive(PartialEq, Debug)]
pub enum qXfer {
FeaturesRead { offset: usize, len: usize },
AuxvRead { offset: usize, len: usize },
}
impl ParseCommand for qXfer {
fn parse(mut bytes: BytesMut) -> Option<Self> {
if bytes.starts_with(b":features:read:") {
let mut iter =
bytes[b":features:read:".len()..].split_mut(|c| *c == b':' || *c == b',');
let annex = iter.next()?;
if annex != b"target.xml" {
return None;
}
let offset = iter.next()?;
let len = iter.next()?;
Some(qXfer::FeaturesRead {
offset: decode_hex(offset).ok()?,
len: decode_hex(len).ok()?,
})
} else if bytes.starts_with(b":auxv:read:") {
let mut iter = bytes[b":auxv:read:".len()..].split_mut(|c| *c == b':' || *c == b',');
let annex = iter.next()?;
if annex != b"" {
return None;
}
let offset = iter.next()?;
let len = iter.next()?;
Some(qXfer::AuxvRead {
offset: decode_hex(offset).ok()?,
len: decode_hex(len).ok()?,
})
} else {
None
}
}
}
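// Illustrative sketch, not part of the commit: both `qXfer` branches above share
// the shape `annex:offset,length`. A single helper could factor that out.
// `hex_usize` and `parse_xfer_read` are hypothetical names, and the hex parsing
// is a standard-library stand-in for `decode_hex`.

```rust
/// Decodes a hex field into a `usize`.
fn hex_usize(field: &[u8]) -> Option<usize> {
    let text = std::str::from_utf8(field).ok()?;
    usize::from_str_radix(text, 16).ok()
}

/// Splits "annex:offset,length" into its three parts. The annex is returned
/// as-is so the caller can compare it against e.g. `b"target.xml"` or `b""`.
fn parse_xfer_read(args: &[u8]) -> Option<(&[u8], usize, usize)> {
    let mut parts = args.split(|c| *c == b':' || *c == b',');
    let annex = parts.next()?;
    let offset = hex_usize(parts.next()?)?;
    let len = hex_usize(parts.next()?)?;
    Some((annex, offset, len))
}
```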


@ -0,0 +1,24 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::BytesMut;
#[derive(PartialEq, Debug)]
pub struct qfThreadInfo;
impl ParseCommand for qfThreadInfo {
fn parse(bytes: BytesMut) -> Option<Self> {
if bytes.is_empty() {
Some(qfThreadInfo)
} else {
None
}
}
}


@ -0,0 +1,24 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::BytesMut;
#[derive(PartialEq, Debug)]
pub struct qsThreadInfo;
impl ParseCommand for qsThreadInfo {
fn parse(bytes: BytesMut) -> Option<Self> {
if bytes.is_empty() {
Some(qsThreadInfo)
} else {
None
}
}
}


@ -0,0 +1,15 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::BytesMut;
pub struct s {
pub addr: Option<u64>,
}


@ -0,0 +1,15 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::BytesMut;
pub struct T {
pub thread: ThreadId,
}


@ -0,0 +1,76 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use bytes::BytesMut;
use nix::sys::signal::Signal;
use crate::gdbstub::{commands::*, hex::*};
#[derive(PartialEq, Debug)]
pub enum vCont {
Query,
Actions(Vec<(ResumeAction, ThreadId)>),
}
impl ParseCommand for vCont {
fn parse(mut bytes: BytesMut) -> Option<Self> {
if bytes == b"?"[..] {
Some(vCont::Query)
} else if bytes.is_empty() {
None
} else {
let mut bytes = bytes.split_off(1);
// example packet: $vCont;s:p3e86d3.3e86d3;c:p3e86d3.-1#3b
// with prefix (`$vCont`) and checksum stripped.
let actions: Vec<(ResumeAction, ThreadId)> = bytes
.split_mut(|c| *c == b';')
.filter_map(|act| {
let mut iter = act.split_mut(|c| *c == b':');
let action = iter.next()?;
let thread_id = iter.next().and_then(|tid| ThreadId::decode(tid))?;
let action = if action.is_empty() {
None
} else {
match action[0] {
b'c' => Some(ResumeAction::Continue(None)),
b'C' => {
let sig = decode_hex::<i32>(&action[1..])
.ok()
.and_then(|s| Signal::try_from(s).ok())?;
Some(ResumeAction::Continue(Some(sig)))
}
b's' => Some(ResumeAction::Step(None)),
b'S' => {
let sig = decode_hex::<i32>(&action[1..])
.ok()
.and_then(|s| Signal::try_from(s).ok())?;
Some(ResumeAction::Step(Some(sig)))
}
b't' => Some(ResumeAction::Stop),
b'r' => {
let mut iter = action[1..].split_mut(|c| *c == b',');
let start: u64 = iter.next().and_then(|x| decode_hex(x).ok())?;
let end: u64 = iter.next().and_then(|x| decode_hex(x).ok())?;
Some(ResumeAction::StepUntil(start, end))
}
_ => None,
}
}?;
Some((action, thread_id))
})
.collect();
if actions.is_empty() {
None
} else {
Some(vCont::Actions(actions))
}
}
}
}
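// Illustrative sketch, not part of the commit: the `;action:thread-id` structure
// of a vCont payload, reduced to the signal-less actions. `Action` and
// `parse_vcont` are hypothetical. Thread ids are kept as raw text, and, unlike
// the `filter_map` above, one malformed action rejects the whole packet.

```rust
#[derive(Debug, PartialEq)]
enum Action {
    Continue,
    Step,
    Stop,
}

/// Parses a payload such as ";s:p3e86d3.3e86d3;c:p3e86d3.-1"
/// (command name and checksum already stripped).
fn parse_vcont(payload: &str) -> Option<Vec<(Action, &str)>> {
    let rest = payload.strip_prefix(';')?;
    let mut actions = Vec::new();
    for item in rest.split(';') {
        let (kind, thread) = item.split_once(':')?;
        let action = match kind {
            "c" => Action::Continue,
            "s" => Action::Step,
            "t" => Action::Stop,
            _ => return None,
        };
        actions.push((action, thread));
    }
    Some(actions)
}
```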


@ -0,0 +1,149 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use bytes::BytesMut;
use std::os::unix::ffi::OsStringExt;
use std::{ffi::OsString, path::PathBuf};
use nix::sys::stat::FileStat;
use reverie::Pid;
use crate::gdbstub::{commands::*, hex::*};
/// struct stat defined by gdb host i/o packet. This is *not* the same as
/// libc::stat or nix's FileStat (which is just libc::stat).
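// NB: packed is needed to force size_of::<HostioStat> == 0x40. Otherwise
// gdb (client) would complain. `C` pins the field order to the declaration
// order, which the default Rust representation does not guarantee.
#[repr(C, packed(4))]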
pub struct HostioStat {
st_dev: u32,
st_ino: u32,
st_mode: u32,
st_nlink: u32,
st_uid: u32,
st_gid: u32,
st_rdev: u32,
st_size: u64,
st_blksize: u64,
st_blocks: u64,
st_atime: u32,
st_mtime: u32,
st_ctime: u32,
}
impl From<FileStat> for HostioStat {
fn from(stat: FileStat) -> HostioStat {
HostioStat {
st_dev: stat.st_dev as u32,
st_ino: stat.st_ino as u32,
st_nlink: stat.st_nlink as u32,
st_mode: stat.st_mode as u32,
st_uid: stat.st_uid,
st_gid: stat.st_gid,
st_rdev: stat.st_rdev as u32,
st_size: stat.st_size as u64,
st_blksize: stat.st_blksize as u64,
st_blocks: stat.st_blocks as u64,
st_atime: stat.st_atime as u32,
st_mtime: stat.st_mtime as u32,
st_ctime: stat.st_ctime as u32,
}
}
}
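// Illustrative sketch, not part of the commit: why packing matters for the 0x40
// size. The two hypothetical structs below have the same field widths as the
// host I/O stat record (7 x u32, 3 x u64, 3 x u32). The sizes quoted assume a
// target where `u64` is 8-byte aligned, such as x86_64.

```rust
/// Natural C layout: the first u64 is padded up to offset 32 and the struct
/// is rounded up to a multiple of 8, giving 0x48 bytes.
#[repr(C)]
struct NaturalLayout {
    head: [u32; 7],
    sizes: [u64; 3],
    times: [u32; 3],
}

/// Alignment capped at 4: no padding is inserted, giving 28 + 24 + 12 = 0x40.
#[repr(C, packed(4))]
struct Packed4Layout {
    head: [u32; 7],
    sizes: [u64; 3],
    times: [u32; 3],
}

fn natural_size() -> usize {
    std::mem::size_of::<NaturalLayout>()
}

fn packed_size() -> usize {
    std::mem::size_of::<Packed4Layout>()
}
```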
#[derive(PartialEq, Debug)]
pub enum vFile {
Setfs(Option<i32>),
Open(PathBuf, i32, u32),
Close(i32),
Pread(i32, isize, isize),
Pwrite(i32, isize, Vec<u8>),
Fstat(i32),
Unlink(PathBuf),
Readlink(PathBuf),
}
impl ParseCommand for vFile {
fn parse(mut bytes: BytesMut) -> Option<Self> {
if bytes.starts_with(b":setfs:") {
let pid: i32 = decode_hex(&bytes[b":setfs:".len()..]).ok()?;
Some(vFile::Setfs(if pid == 0 { None } else { Some(pid) }))
} else if bytes.starts_with(b":open:") {
let mut iter = bytes[b":open:".len()..].split_mut(|c| *c == b',');
let fname = iter.next().and_then(|s| decode_hex_string(s).ok())?;
let fname = PathBuf::from(OsString::from_vec(fname));
let flags = iter.next().and_then(|s| decode_hex(s).ok())?;
let mode = iter.next().and_then(|s| decode_hex(s).ok())?;
Some(vFile::Open(fname, flags, mode))
} else if bytes.starts_with(b":close:") {
let fd: i32 = decode_hex(&bytes[b":close:".len()..]).ok()?;
Some(vFile::Close(fd))
} else if bytes.starts_with(b":pread:") {
let mut iter = bytes[b":pread:".len()..].split_mut(|c| *c == b',');
let fd = iter.next().and_then(|s| decode_hex(s).ok())?;
let count = iter.next().and_then(|s| decode_hex(s).ok())?;
let offset = iter.next().and_then(|s| decode_hex(s).ok())?;
Some(vFile::Pread(fd, count, offset))
} else if bytes.starts_with(b":pwrite:") {
let mut iter = bytes[b":pwrite:".len()..].split_mut(|c| *c == b',');
let fd = iter.next().and_then(|s| decode_hex(s).ok())?;
let offset = iter.next().and_then(|s| decode_hex(s).ok())?;
let bytes = iter.next().and_then(|s| decode_binary_string(s).ok())?;
Some(vFile::Pwrite(fd, offset, bytes))
} else if bytes.starts_with(b":fstat:") {
let fd: i32 = decode_hex(&bytes[b":fstat:".len()..]).ok()?;
Some(vFile::Fstat(fd))
} else if bytes.starts_with(b":unlink:") {
let fname = bytes.split_off(b":unlink:".len());
let fname = decode_hex_string(&fname).ok()?;
let fname = PathBuf::from(OsString::from_vec(fname));
Some(vFile::Unlink(fname))
} else if bytes.starts_with(b":readlink:") {
let fname = bytes.split_off(b":readlink:".len());
let fname = decode_hex_string(&fname).ok()?;
let fname = PathBuf::from(OsString::from_vec(fname));
Some(vFile::Readlink(fname))
} else {
None
}
}
}
#[cfg(test)]
mod test {
use super::*;
use std::mem;
#[test]
fn hostio_stat_size_check() {
assert_eq!(mem::size_of::<HostioStat>(), 0x40);
}
#[test]
fn hostio_sanity() {
// NB: `vFile` prefix is stripped prior.
assert_eq!(
vFile::parse(BytesMut::from(&b":open:6a7573742070726f62696e67,0,1c0"[..])),
Some(vFile::Open(PathBuf::from("just probing"), 0x0, 0x1c0))
);
assert_eq!(
vFile::parse(BytesMut::from(&b":pread:b,1000,0"[..])),
Some(vFile::Pread(0xb, 0x1000, 0x0))
);
assert_eq!(
vFile::parse(BytesMut::from(&b":unlink:6a7573742070726f62696e67"[..])),
Some(vFile::Unlink(PathBuf::from("just probing")))
);
assert_eq!(
vFile::parse(BytesMut::from(&b":readlink:6a7573742070726f62696e67"[..])),
Some(vFile::Readlink(PathBuf::from("just probing")))
);
}
}


@ -0,0 +1,31 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use bytes::BytesMut;
use crate::gdbstub::{commands::*, hex::*};
use reverie::Pid;
#[derive(PartialEq, Debug)]
pub struct vKill {
pub pid: Pid,
}
impl ParseCommand for vKill {
fn parse(bytes: BytesMut) -> Option<Self> {
if !bytes.starts_with(b";") {
None
} else {
let pid = decode_hex(&bytes[1..]).ok()?;
Some(vKill {
pid: Pid::from_raw(pid),
})
}
}
}


@ -0,0 +1,72 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use bytes::BytesMut;
use crate::gdbstub::{commands::*, hex::*};
#[derive(PartialEq, Debug)]
pub struct X {
pub addr: u64,
pub length: usize,
pub vals: Vec<u8>,
}
impl ParseCommand for X {
fn parse(mut bytes: BytesMut) -> Option<Self> {
// Only the first ':' separates the header from the binary payload, which
// may itself contain ':' or ',' bytes.
let first_colon = bytes.iter().position(|&b| b == b':')?;
let (addr_len, vals) = bytes.split_at_mut(first_colon);
let mut iter = addr_len.split_mut(|c| *c == b',');
let addr = iter.next().and_then(|s| decode_hex(s).ok())?;
let len = iter.next().and_then(|s| decode_hex(s).ok())?;
Some(X {
addr,
length: len,
vals: decode_binary_string(&vals[1..]).ok()?,
})
}
}
#[cfg(test)]
mod test {
use super::*;
#[test]
fn can_parse_X_special() {
// Sending packet: $X216eb0,4:,\000\000\000#ae...Packet received: OK
assert_eq!(
X::parse(BytesMut::from("216eb0,4:,\0\0\0")),
Some(X {
addr: 0x216eb0,
length: 4,
vals: vec![0x2c, 0x0, 0x0, 0x0],
})
);
// Sending packet: $X216eb0,4::\000\000\000#bc...Packet received: OK
assert_eq!(
X::parse(BytesMut::from("216eb0,4::\0\0\0")),
Some(X {
addr: 0x216eb0,
length: 4,
vals: vec![0x3a, 0x0, 0x0, 0x0],
})
);
}
}
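// Illustrative sketch, not part of the commit: the binary payload of an `X`
// packet uses the remote protocol's escape scheme, where 0x7d (`}`) marks the
// next byte as escaped and the original byte is that byte XOR 0x20.
// `unescape_binary` is a hypothetical helper and may differ from the crate's
// `decode_binary_string`.

```rust
/// Undoes the binary escaping. Returns `None` if the data ends right after
/// an escape marker.
fn unescape_binary(data: &[u8]) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(data.len());
    let mut iter = data.iter().copied();
    while let Some(b) = iter.next() {
        if b == 0x7d {
            // The byte after the marker carries the payload, XORed with 0x20.
            out.push(iter.next()? ^ 0x20);
        } else {
            out.push(b);
        }
    }
    Some(out)
}
```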


@ -0,0 +1,33 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use bytes::BytesMut;
use crate::gdbstub::{commands::*, hex::*};
#[derive(PartialEq, Debug)]
pub struct z {
pub ty: BreakpointType,
pub addr: u64,
pub kind: u8,
}
impl ParseCommand for z {
fn parse(mut bytes: BytesMut) -> Option<Self> {
let mut iter = bytes.split_mut(|c| *c == b',');
let ty = iter
.next()
.and_then(|s| decode_hex(s).ok())
.and_then(BreakpointType::new)?;
let addr = iter.next().and_then(|s| decode_hex(s).ok())?;
let kind = iter.next().and_then(|s| decode_hex(s).ok())?;
Some(z { ty, addr, kind })
}
}
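// Illustrative sketch, not part of the commit: a `z`/`Z` payload has the form
// `type,addr,kind`. The type numbering below follows the GDB remote protocol
// documentation (0 software, 1 hardware, 2 write watchpoint, 3 read
// watchpoint). `PointKind` and `parse_point` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum PointKind {
    Software,
    Hardware,
    WriteWatch,
    ReadWatch,
}

/// Parses "type,addr,kind" with hexadecimal `addr` and `kind`.
fn parse_point(payload: &str) -> Option<(PointKind, u64, u8)> {
    let mut fields = payload.split(',');
    let ty = match fields.next()? {
        "0" => PointKind::Software,
        "1" => PointKind::Hardware,
        "2" => PointKind::WriteWatch,
        "3" => PointKind::ReadWatch,
        _ => return None,
    };
    let addr = u64::from_str_radix(fields.next()?, 16).ok()?;
    let kind = u8::from_str_radix(fields.next()?, 16).ok()?;
    Some((ty, addr, kind))
}
```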


@ -0,0 +1,33 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use bytes::BytesMut;
use crate::gdbstub::{commands::*, hex::*};
#[derive(PartialEq, Debug)]
pub struct Z {
pub ty: BreakpointType,
pub addr: u64,
pub kind: u8,
// NB: conditional bkpt here?
}
impl ParseCommand for Z {
fn parse(mut bytes: BytesMut) -> Option<Self> {
let mut iter = bytes.split_mut(|c| *c == b',');
let ty = iter
.next()
.and_then(|s| decode_hex(s).ok())
.and_then(BreakpointType::new)?;
let addr = iter.next().and_then(|s| decode_hex(s).ok())?;
let kind = iter.next().and_then(|s| decode_hex(s).ok())?;
Some(Z { ty, addr, kind })
}
}


@ -0,0 +1,64 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
mod _QuestionMark;
//mod _c;
mod _d_upper;
mod _g;
mod _g_upper;
mod _h_upper;
//mod _k;
mod _m;
mod _m_upper;
//mod _p;
//mod _p_upper;
mod _QStartNoAckMode;
mod _QThreadEvents;
mod _qAttached;
mod _qC;
mod _qSupported;
mod _qXfer;
mod _qfThreadInfo;
mod _qsThreadInfo;
//mod _s;
//mod _t_upper;
mod _vCont;
mod _vFile;
mod _vKill;
mod _x_upper;
mod _z;
mod _z_upper;
pub use _QuestionMark::*;
//pub use _c::*;
pub use _d_upper::*;
pub use _g::*;
pub use _g_upper::*;
pub use _h_upper::*;
//pub use _k::*;
pub use _m::*;
pub use _m_upper::*;
//pub use _p::*;
//pub use _p_upper::*;
pub use _QStartNoAckMode::*;
pub use _QThreadEvents::*;
pub use _qAttached::*;
pub use _qC::*;
pub use _qSupported::*;
pub use _qXfer::*;
pub use _qfThreadInfo::*;
pub use _qsThreadInfo::*;
//pub use _s::*;
//pub use _t_upper::*;
pub use _vCont::*;
pub use _vFile::*;
pub use _vKill::*;
pub use _x_upper::*;
pub use _z::*;
pub use _z_upper::*;


@ -0,0 +1,24 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::BytesMut;
#[derive(PartialEq, Debug)]
pub struct ExclamationMark;
impl ParseCommand for ExclamationMark {
fn parse(bytes: BytesMut) -> Option<Self> {
if bytes.is_empty() {
Some(ExclamationMark)
} else {
None
}
}
}


@ -0,0 +1,27 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::BytesMut;
#[derive(Debug, PartialEq)]
pub struct QDisableRandomization {
pub val: bool,
}
impl ParseCommand for QDisableRandomization {
fn parse(bytes: BytesMut) -> Option<Self> {
if bytes == ":0" {
Some(QDisableRandomization { val: false })
} else if bytes == ":1" {
Some(QDisableRandomization { val: true })
} else {
None
}
}
}


@ -0,0 +1,16 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::Bytes;
pub struct QEnvironmentHexEncoded {
pub key: Bytes,
pub value: Option<Bytes>,
}


@ -0,0 +1,12 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
pub struct QEnvironmentReset;


@ -0,0 +1,15 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::Bytes;
pub struct QEnvironmentUnset {
pub key: Bytes,
}


@ -0,0 +1,15 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::Bytes;
pub struct QSetWorkingDir {
pub dir: Option<Bytes>,
}


@ -0,0 +1,15 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::BytesMut;
pub struct QStartupWithShell {
pub val: bool,
}

@ -0,0 +1,13 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::BytesMut;
pub struct R;

@ -0,0 +1,16 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::BytesMut;
use reverie::Pid;
pub struct vAttach {
pub pid: Pid,
}

@ -0,0 +1,16 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::Bytes;
pub struct vRun {
pub filename: Option<Bytes>,
pub args: Bytes, // use Args type here!
}

@ -0,0 +1,30 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
mod _ExclamationMark;
mod _QDisableRandomization;
//mod _QEnvironmentHexEncoded;
//mod _QEnvironmentReset;
//mod _QEnvironmentUnset;
//mod _QSetWorkingDir;
//mod _QStartupWithShell;
//mod _r_upper;
//mod _vAttach;
//mod _vRun;
pub use _ExclamationMark::*;
pub use _QDisableRandomization::*;
//pub use _QEnvironmentHexEncoded::*;
//pub use _QEnvironmentReset::*;
//pub use _QEnvironmentUnset::*;
//pub use _QSetWorkingDir::*;
//pub use _QStartupWithShell::*;
//pub use _r_upper::*;
//pub use _vAttach::*;
//pub use _vRun::*;

@ -0,0 +1,589 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
#![allow(non_snake_case, non_camel_case_types, dead_code, unused_imports)]
use crate::gdbstub::{
hex::*, request::*, response::*, BreakpointType, Inferior, InferiorThreadId, ResumeInferior,
StoppedInferior,
};
use crate::trace::{ChildOp, Stopped};
use bytes::{Bytes, BytesMut};
use paste::paste;
use std::{collections::BTreeMap, path::PathBuf};
use thiserror::Error;
use tokio::sync::{broadcast, mpsc, oneshot};
use reverie::{ExitStatus, Pid, Signal};
mod base;
mod extended_mode;
mod monitor_cmd;
mod section_offsets;
pub use base::*;
pub use extended_mode::*;
pub use monitor_cmd::*;
pub use section_offsets::*;
trait ParseCommand: Sized {
fn parse(buff: BytesMut) -> Option<Self>;
}
#[derive(PartialEq, Debug, Clone, Copy)]
pub enum IdKind {
// all threads: `-1'.
All,
// any thread: `0'.
Any,
Id(Pid),
}
impl IdKind {
pub fn from_raw(pid: i32) -> Self {
match pid {
-1 => IdKind::All,
0 => IdKind::Any,
_ => IdKind::Id(Pid::from_raw(pid)),
}
}
#[allow(clippy::wrong_self_convention)]
pub fn into_raw(&self) -> i32 {
match self {
IdKind::All => -1,
IdKind::Any => 0,
IdKind::Id(pid) => pid.as_raw(),
}
}
pub fn matches(&self, other: &IdKind) -> bool {
match (self, &other) {
(IdKind::All, _) => true,
(IdKind::Any, _) => true,
(IdKind::Id(pid1), IdKind::Id(pid2)) => pid1 == pid2,
(IdKind::Id(_), _) => other.matches(self),
}
}
}
impl WriteResponse for IdKind {
fn write_response(&self, writer: &mut ResponseWriter) {
match self {
IdKind::All => writer.put_str("-1"),
IdKind::Any => writer.put_str("0"),
IdKind::Id(pid) => writer.put_num(pid.as_raw()),
}
}
}
#[derive(PartialEq, Debug)]
pub enum ThreadOp {
c, // step and continue, deprecated because of `vCont'
g, // Other operations
G,
m,
M,
}
/// Gdb ThreadId. See https://sourceware.org/gdb/onlinedocs/gdb/Packets.html#thread_002did-syntax
/// for more details.
#[derive(PartialEq, Debug, Clone, Copy)]
pub struct ThreadId {
pub pid: IdKind,
pub tid: IdKind,
}
impl ThreadId {
pub fn all() -> Self {
ThreadId {
tid: IdKind::All,
pid: IdKind::All,
}
}
pub fn any() -> Self {
ThreadId {
tid: IdKind::All,
pid: IdKind::Any,
}
}
pub fn pid(pid: i32) -> Self {
ThreadId {
tid: IdKind::All,
pid: IdKind::from_raw(pid),
}
}
pub fn pid_tid(pid: i32, tid: i32) -> Self {
ThreadId {
pid: IdKind::from_raw(pid),
tid: IdKind::from_raw(tid),
}
}
// NB: Specifying just a process, as ppid, is equivalent to ppid.-1.
pub fn decode(bytes: &[u8]) -> Option<Self> {
if !bytes.starts_with(b"p") {
return None;
}
let mut iter = bytes[1..].split(|c| *c == b'.');
let p = iter.next().and_then(|x| decode_hex(x).ok())?;
Some(
match iter.next().and_then(|x| {
if x == &b"-1"[..] {
Some(-1)
} else {
decode_hex(x).ok()
}
}) {
Some(t) => ThreadId::pid_tid(p, t),
None => ThreadId::pid(p),
},
)
}
/// Check whether `other` matches this `ThreadId`: both the pid and the tid
/// components must match.
pub fn matches(&self, other: &ThreadId) -> bool {
self.pid.matches(&other.pid) && self.tid.matches(&other.tid)
}
pub fn getpid(&self) -> Option<Pid> {
let id = self.pid.into_raw();
if id > 0 {
Some(Pid::from_raw(id))
} else {
None
}
}
pub fn gettid(&self) -> Option<Pid> {
let id = self.tid.into_raw();
if id > 0 {
Some(Pid::from_raw(id))
} else {
None
}
}
}
impl WriteResponse for ThreadId {
fn write_response(&self, writer: &mut ResponseWriter) {
writer.put_str("p");
self.pid.write_response(writer);
writer.put_str(".");
self.tid.write_response(writer);
}
}
macro_rules! commands {
(
$(#[$attrs:meta])*
$vis:vis enum $Name:ident {
$(
$(#[$ext_attrs:meta])*
$ext:ident {
$($name:literal => $command:ident,)*
}
)*
}
) => {paste! {
$(
#[allow(non_camel_case_types)]
#[derive(PartialEq, Debug)]
$(#[$ext_attrs])*
$vis enum [<$ext:camel>] {
$($command(self::$ext::$command),)*
}
)*
/// GDB commands
$(#[$attrs])*
$vis enum $Name {
$(
[<$ext:camel>]([<$ext:camel>]),
)*
Unknown(Bytes),
}
impl Command {
pub fn try_parse(
mut buf: BytesMut
) -> Result<Command, CommandParseError> {
if buf.is_empty() {
return Err(CommandParseError::Empty);
}
let body = buf.as_ref();
$(
match body {
$(_ if body.starts_with($name.as_bytes()) => {
let nb = $name.as_bytes().len();
let cmd = self::$ext::$command::parse(buf.split_off(nb))
.ok_or(CommandParseError::MalformedCommand(String::from(concat!($name))))?;
return Ok(
Command::[<$ext:camel>](
[<$ext:camel>]::$command(cmd)
)
)
})*
_ => {},
}
)*
Ok(Command::Unknown(buf.freeze()))
}
}
}};
}
/// Command parse error
#[derive(Debug, PartialEq, Error)]
pub enum CommandParseError {
/// Command is empty
#[error("Command is empty")]
Empty,
/// Malformed command
#[error("Malformed command: {}", .0)]
MalformedCommand(String),
#[error("Malformed registers found from g/G packet")]
MalformedRegisters,
}
commands! {
#[derive(PartialEq, Debug)]
pub enum Command {
base {
"?" => QuestionMark,
"D" => D,
"g" => g,
"G" => G,
"H" => H,
"m" => m,
"M" => M,
"qAttached" => qAttached,
"QThreadEvents" => QThreadEvents,
"qC" => qC,
"qfThreadInfo" => qfThreadInfo,
"QStartNoAckMode" => QStartNoAckMode,
"qsThreadInfo" => qsThreadInfo,
"qSupported" => qSupported,
"qXfer" => qXfer,
"vCont" => vCont,
"vKill" => vKill,
"z" => z,
"Z" => Z,
"X" => X,
/* host i/o commands */
"vFile" => vFile,
}
extended_mode {
"!" => ExclamationMark,
"QDisableRandomization" => QDisableRandomization,
}
monitor_cmd {
"qRcmd" => qRcmd,
}
section_offsets {
"qOffsets" => qOffsets,
}
}
}
/// Resume actions set by vCont.
#[derive(PartialEq, Clone, Copy, Debug)]
pub enum ResumeAction {
/// Single step, with optional signal.
Step(Option<Signal>),
/// Continue, with optional signal.
Continue(Option<Signal>),
/// Stop the thread (presumably vCont's `t` action).
Stop,
/// Keep stepping until rip doesn't belong to start..=end.
StepUntil(u64, u64),
}
/// Replay log used by reverse debugging.
#[derive(PartialEq, Clone, Copy, Debug)]
pub enum ReplayLog {
/// Replay log reached the beginning.
Begin,
/// Replay log reached the end.
End,
}
/// Expedited registers. Stop reply packets (e.g. in response to vCont) can
/// carry extra register values, so that gdb doesn't have to read registers
/// unless necessary. On amd64, they're %rbp, %rsp and %rip.
#[derive(PartialEq, Clone, Debug)]
pub struct ExpediatedRegs(BTreeMap<usize, u64>);
impl From<libc::user_regs_struct> for ExpediatedRegs {
fn from(regs: libc::user_regs_struct) -> Self {
let mut exp_regs = BTreeMap::new();
exp_regs.insert(6, regs.rbp);
exp_regs.insert(7, regs.rsp);
exp_regs.insert(0x10, regs.rip);
ExpediatedRegs(exp_regs)
}
}
#[derive(PartialEq, Clone, Debug)]
pub enum StopEvent {
/// Stopped by signal.
Signal(Signal),
/// Stopped by software breakpoint.
SwBreak,
/// Stopped due to vforkdone event.
Vforkdone,
/// Replay reached either begin or end.
ReplayDone(ReplayLog),
/// Stopped due to exec event.
Exec(PathBuf),
}
#[derive(Debug, Clone)]
pub struct StoppedTask {
/// Pid of the event (SYS_gettid)
pub pid: Pid,
/// Thread Group id of the event (SYS_getpid)
pub tgid: Pid,
/// Stop event
pub event: StopEvent,
/// Expedited registers specified by gdb remote protocol
pub regs: ExpediatedRegs,
}
#[derive(Debug)]
pub struct NewTask {
/// Pid of the event (SYS_gettid)
pub pid: Pid,
/// Thread Group id of the event (SYS_getpid)
pub tgid: Pid,
/// New child Pid
pub child: Pid,
/// Expedited registers specified by gdb remote protocol
pub regs: ExpediatedRegs,
/// Clone type
pub op: ChildOp,
/// channel to send gdb request
pub request_tx: Option<mpsc::Sender<GdbRequest>>,
/// channel to send gdb resume request
pub resume_tx: Option<mpsc::Sender<ResumeInferior>>,
/// channel to receive new gdb stop event
pub stop_rx: Option<mpsc::Receiver<StoppedInferior>>,
}
/// Reasons why inferior has stopped, reported to gdb (client).
/// See section ["Stop Reply Packets"](https://sourceware.org/gdb/onlinedocs/gdb/Stop-Reply-Packets.html#Stop-Reply-Packets)
/// for more details.
#[derive(Debug)]
pub enum StopReason {
Stopped(StoppedTask),
NewTask(NewTask),
Exited(Pid, ExitStatus),
ThreadExited(Pid, Pid, ExitStatus),
}
impl StopReason {
pub fn stopped(pid: Pid, tgid: Pid, event: StopEvent, regs: ExpediatedRegs) -> Self {
StopReason::Stopped(StoppedTask {
pid,
tgid,
event,
regs,
})
}
// FIXME: Reduce number of arguments.
#[allow(clippy::too_many_arguments)]
pub fn new_task(
pid: Pid,
tgid: Pid,
child: Pid,
regs: ExpediatedRegs,
op: ChildOp,
request_tx: Option<mpsc::Sender<GdbRequest>>,
resume_tx: Option<mpsc::Sender<ResumeInferior>>,
stop_rx: Option<mpsc::Receiver<StoppedInferior>>,
) -> Self {
StopReason::NewTask(NewTask {
pid,
tgid,
child,
regs,
op,
request_tx,
resume_tx,
stop_rx,
})
}
pub fn thread_exited(pid: Pid, tgid: Pid, exit_status: ExitStatus) -> Self {
StopReason::ThreadExited(pid, tgid, exit_status)
}
pub fn exit(pid: Pid, exit_status: ExitStatus) -> Self {
StopReason::Exited(pid, exit_status)
}
}
impl WriteResponse for StopReason {
fn write_response(&self, writer: &mut ResponseWriter) {
match self {
StopReason::NewTask(new_task) => {
writer.put_str("T05");
match new_task.op {
ChildOp::Fork => {
// T05fork:p21feb6.21feb6;06:30dcffffff7f0000;07:10dcffffff7f0000;10:37c2ecf7ff7f0000;thread:p21f994.21f994;core:10;
let thread_id = ThreadId {
pid: IdKind::from_raw(new_task.child.as_raw()),
tid: IdKind::from_raw(new_task.child.as_raw()),
};
writer.put_str("fork:");
thread_id.write_response(writer);
writer.put_str(";");
}
ChildOp::Vfork => {
let thread_id = ThreadId {
pid: IdKind::from_raw(new_task.child.as_raw()),
tid: IdKind::from_raw(new_task.child.as_raw()),
};
writer.put_str("vfork:");
thread_id.write_response(writer);
writer.put_str(";");
}
ChildOp::Clone => {
let thread_id = ThreadId {
pid: IdKind::from_raw(new_task.tgid.as_raw()),
tid: IdKind::from_raw(new_task.child.as_raw()),
};
writer.put_str("create:");
thread_id.write_response(writer);
writer.put_str(";");
}
}
for (regno, regval) in &new_task.regs.0 {
writer.put_num(*regno);
writer.put_str(":");
writer.put_hex_encoded(&regval.to_ne_bytes());
writer.put_str(";");
}
let thread_id = ThreadId::pid_tid(new_task.tgid.as_raw(), new_task.pid.as_raw());
writer.put_str("thread:");
thread_id.write_response(writer);
writer.put_str(";");
}
StopReason::Stopped(stopped) => {
writer.put_str("T05");
match &stopped.event {
StopEvent::Signal(_) => {}
StopEvent::SwBreak => {
writer.put_str("swbreak:;");
}
StopEvent::Vforkdone => {
writer.put_str("vforkdone:;");
}
StopEvent::Exec(p) => {
// T05exec:2f746d702f6631;06:0000000000000000;07:80ddffffff7f0000;10:9030fdf7ff7f0000;thread:p350ad8.350ad8;core:9;
writer.put_str("exec:");
if let Some(p) = p.to_str() {
writer.put_hex_encoded(p.as_bytes());
}
writer.put_str(";");
}
StopEvent::ReplayDone(log) => match log {
ReplayLog::Begin => writer.put_str("replaylog:begin;"),
ReplayLog::End => writer.put_str("replaylog:end;"),
},
}
for (regno, regval) in &stopped.regs.0 {
writer.put_num(*regno);
writer.put_str(":");
writer.put_hex_encoded(&regval.to_ne_bytes());
writer.put_str(";");
}
let thread_id = ThreadId::pid_tid(stopped.tgid.as_raw(), stopped.pid.as_raw());
writer.put_str("thread:");
thread_id.write_response(writer);
writer.put_str(";");
}
StopReason::Exited(pid, exit_status) => {
match exit_status {
ExitStatus::Exited(code) => {
writer.put_str("W");
writer.put_hex_encoded(&[*code as u8]);
}
ExitStatus::Signaled(sig, _) => {
writer.put_str("X");
writer.put_hex_encoded(&[(*sig as u8) | 0x80]);
}
}
writer.put_str(";process:");
writer.put_num(pid.as_raw());
}
StopReason::ThreadExited(pid, tgid, exit_status) => {
match exit_status {
ExitStatus::Exited(code) => {
writer.put_str("w");
writer.put_hex_encoded(&[*code as u8]);
}
ExitStatus::Signaled(_, _) => unreachable!(),
}
writer.put_str(";");
let threadid = ThreadId::pid_tid(tgid.as_raw(), pid.as_raw());
threadid.write_response(writer);
}
}
}
}
#[cfg(test)]
mod test {
use super::*;
#[test]
fn decode_vcont_test() {
let mut packet = BytesMut::from("$vCont;s:p3e86d3.3e86d3;c:p3e86d3.-1#3b");
let vcont = vCont::parse(packet.split());
assert!(vcont.is_some());
let vcont = vCont::parse(BytesMut::from("$vCont;c:p2.-1#10"));
assert!(vcont.is_some());
}
#[test]
fn unknown_command() {
let mut packet = BytesMut::from("just,an,unknown,command#3b");
let cmd = Command::try_parse(packet.split());
assert!(cmd.is_ok());
assert!(matches!(cmd.unwrap(), Command::Unknown(_)));
}
#[test]
fn malformed_command() {
let mut packet = BytesMut::from("vCont,Just a bad command;c:1.-1#fe");
let cmd = Command::try_parse(packet.split());
assert_eq!(
cmd,
Err::<Command, CommandParseError>(CommandParseError::MalformedCommand(String::from(
"vCont"
)))
);
}
}
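The thread-id syntax handled by `ThreadId::decode` and `WriteResponse for ThreadId` can be sketched with only the standard library. `RawThreadId`, `parse_id`, `decode_thread_id` and `encode_thread_id` are hypothetical names, not the crate's API. The sketch also deviates from the code above in two ways, both assumptions made for illustration: it accepts `-1` for the pid field, and it rejects a malformed tid instead of falling back to "all threads".

```rust
/// Hypothetical stand-in for the crate's `ThreadId`: raw pid/tid values,
/// where -1 means "all" and 0 means "any".
#[derive(Debug, PartialEq)]
struct RawThreadId {
    pid: i32,
    tid: i32,
}

/// Parse one id field: either the literal `-1` or unsigned hex digits.
fn parse_id(field: &str) -> Option<i32> {
    if field == "-1" {
        return Some(-1);
    }
    // Reject empty fields and sign characters that `from_str_radix` would accept.
    if field.is_empty() || !field.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    i32::from_str_radix(field, 16).ok()
}

/// Decode `p<pid>[.<tid>]`; a missing tid is treated as -1 (all threads).
fn decode_thread_id(s: &str) -> Option<RawThreadId> {
    let rest = s.strip_prefix('p')?;
    let mut parts = rest.splitn(2, '.');
    let pid = parse_id(parts.next()?)?;
    let tid = match parts.next() {
        Some(t) => parse_id(t)?,
        None => -1,
    };
    Some(RawThreadId { pid, tid })
}

/// Encode back to the wire form `p<pid>.<tid>`, with ids in hex.
fn encode_thread_id(id: &RawThreadId) -> String {
    let field = |v: i32| {
        if v == -1 {
            "-1".to_string()
        } else {
            format!("{:x}", v)
        }
    };
    format!("p{}.{}", field(id.pid), field(id.tid))
}

fn main() {
    let id = decode_thread_id("p3e86d3.3e86d3").unwrap();
    assert_eq!(id, RawThreadId { pid: 0x3e86d3, tid: 0x3e86d3 });
    // Round-trips to the same wire form.
    assert_eq!(encode_thread_id(&id), "p3e86d3.3e86d3");
    // A bare process id means "all threads of that process".
    assert_eq!(decode_thread_id("p2"), decode_thread_id("p2.-1"));
}
```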

@ -0,0 +1,28 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::{Bytes, BytesMut};
#[derive(PartialEq, Debug)]
pub struct qRcmd {
pub cmd: Bytes,
}
impl ParseCommand for qRcmd {
fn parse(bytes: BytesMut) -> Option<Self> {
if bytes.is_empty() {
None
} else {
Some(qRcmd {
cmd: bytes.freeze(),
})
}
}
}
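`qRcmd::parse` keeps the payload as raw bytes. In the GDB remote protocol the monitor command travels as `qRcmd,<hex-encoded command>`, so a later stage has to strip the comma and hex-decode the rest. A minimal sketch of that step follows; `decode_monitor_cmd` is a hypothetical helper and not the crate's `hex` module.

```rust
/// Hypothetical helper (not part of the crate): decode the payload of a
/// `qRcmd` packet. `rest` is what follows the `qRcmd` prefix, for example
/// `,68656c70` for the monitor command "help".
fn decode_monitor_cmd(rest: &[u8]) -> Option<Vec<u8>> {
    // The payload starts with a comma separator.
    let hex = rest.strip_prefix(b",")?;
    // Each output byte is encoded as exactly two hex digits.
    if hex.is_empty() || hex.len() % 2 != 0 {
        return None;
    }
    hex.chunks(2)
        .map(|pair| {
            let hi = (pair[0] as char).to_digit(16)?;
            let lo = (pair[1] as char).to_digit(16)?;
            Some((hi * 16 + lo) as u8)
        })
        .collect()
}

fn main() {
    let cmd = decode_monitor_cmd(b",68656c70").unwrap();
    assert_eq!(cmd, b"help".to_vec());
    // Missing separator, odd length and non-hex digits are all rejected.
    assert_eq!(decode_monitor_cmd(b"68656c70"), None);
    assert_eq!(decode_monitor_cmd(b",686"), None);
    assert_eq!(decode_monitor_cmd(b",zz"), None);
}
```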

@ -0,0 +1,12 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
mod _qRcmd;
pub use _qRcmd::*;

@ -0,0 +1,24 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree.
*/
use crate::gdbstub::{commands::*, hex::*};
use bytes::BytesMut;
#[derive(PartialEq, Debug)]
pub struct qOffsets;
impl ParseCommand for qOffsets {
fn parse(bytes: BytesMut) -> Option<Self> {
if bytes.is_empty() {
Some(qOffsets)
} else {
None
}
}
}

Some files were not shown because too many files have changed in this diff